FOREWORD


Census counts are used (1) in Congressional reapportionment, (2) in the allocation of
funds and resources to local areas, (3) as a base for more current estimates that are
also used for allocating Federal resources, and (4) as a foundation for the planning and
evaluation of private and public programs. Their importance for these uses has caused
increasing interest in the accuracy of census counts and in the availability of estimates
of coverage. This is over and above the more obvious application of census results as the
basis for political redistricting, now often involving counts for areas as small as city
blocks. These concerns, and the Census Bureau's continuing interest in measuring and
reporting on the quality of the data obtained in the decennial census, led to the
conclusion that it was appropriate to take as broad a perspective as possible on the issue
of census adjustment.

Although  the  Bureau  itself  has  been  active  in  research  concerning  the  undercount,  it 
also  recognized  the  need  to  obtain  the  views  of  the  general  research  and  statistical  com- 
munity. This  is  in  agreement  also  with  the  findings  of  a  National  Academy  of  Sciences 
panel  convened  in  1977  to  examine  the  planning  for  the  1980  census.  The  panel  stressed 
the  importance  of  the  counts,  concluded  that  the  figures  should  be  adjusted,  and  recom- 
mended that  the  Bureau  continue  to  investigate  adequate  technical  means  for  measuring 
the  undercount  and  for  adjusting  the  counts. 

As  a  first  step,  the  Bureau  of  the  Census  convened  a  census  undercount  workshop, 
September  5-8,  1979,  at  Reston,  Va.  The  workshop  participants  included  management 
and  professional  personnel  from  the  Bureau  of  the  Census,  the  Department  of  Commerce, 
and  a  few  additional  participants  familiar  with  the  undercount  issue  and  its  implications. 
The  workshop  was  structured  under  the  guidelines  of  a  decisionmaking  system,  "Strategic 
Assumption  Testing  and  Surfacing  for  Strategic  Management," developed  by  Dr.  Richard  O. 
Mason  of  the  University  of  Southern  California,  and  Dr.  Ian  I.  Mitroff  of  the  University 
of  Pittsburgh. 

The  specific  purpose  of  the  workshop  was  to  determine  whether  or  not  the  discussion 
of  the  undercount  to  date  had  been  sufficiently  comprehensive  to  identify  all  of  the 
issues  and  assumptions  relative  to  undercount  measurement  and  adjustment.  Assumptions 
identified  in  the  workshop  were  examined  in  terms  of  their  relationships  to  other  assump- 
tions and  the  importance  and  degree  of  certainty  that  should  be  attached  to  each. 

Following  the  undercount  workshop,  the  Bureau  sponsored  the  Conference  on  Census 
Undercount  on  February  25-26,  1980,  in  Arlington,  Va.  The  conference  was  designed  to 
provide  a  forum  for  considering  alternative  approaches  to  measuring  the  census  under- 
count and  to  assess  the  implications  of  adjusting  the  census  counts. 

This volume contains the conference papers and the discussion of the papers at the
conference. In order to investigate as broad a range of concerns as possible at the
conference, the Bureau undertook a general solicitation of papers on undercount issues
(see appendix A). Papers were solicited on the undercount in general, but also on a
number of specific concerns:

Methods  of  measuring  the  undercount  for  subnational  areas,  including  the  quality  of 
the  estimates  of  undercount  in  relation  to  the  size  and  other  characteristics  of  the  area; 
and  the  feasibility  of  providing  accuracy  checks  or  confidence  intervals. 

The  timing  of  the  adjustment(s)  for  undercount. 

Measuring  and  adjusting  for  the  undercount  and  misreporting  for  factors  other  than 
total  population,  such  as  social  and  economic  characteristics. 

The  use  of  adjusted  figures  in  Federal  programs  and  the  impact  of  adjustments  on  the 
Federal  statistical  system. 

Political  and  legal  issues  in  making  adjustments  to  the  census  counts. 

The  effects  of  adjustments  on  equity  in  the  distribution  of  Federal  funds. 

Decision  theory  and  theoretical  aspects  of  adjustment. 

Under  the  direction  of  a  Conference  Steering  Committee,  17  papers  were  selected  to 
be  presented  at  the  conference.  In  addition  to  reviewing  the  papers  proposed  for  the 
conference,  the  Steering  Committee  guided  the  general  planning  and  program  for  the 
conference.  The  members  of  the  Steering  Committee  were: 

Conrad Taeuber (Chair), Georgetown University

William G. Cochran, Harvard University (Deceased)

Nathan Keyfitz, Harvard University

Leslie Kish, University of Michigan

William H. Kruskal, University of Chicago

Daniel B. Levine, Bureau of the Census

Evelyn Mann, City of New York

Harry V. Roberts, University of Chicago

Julian Samora, University of Notre Dame

Richard M. Scammon, Elections Research Center

Richard Smolka, American University

Bruce Spencer, Northwestern University

Phyllis A. Wallace, Massachusetts Institute of Technology

Eddie Williams, Joint Center for Political Studies

Meyer Zitter, Bureau of the Census

We  are  indebted  to  the  Steering  Committee  for  its  work,  to  the  authors  and  dis- 
cussants for  their  careful  attention  to  the  undercount  issues,  and  to  the  conference  par- 
ticipants for  their  observations  and  suggestions. 

Vincent  P.  Barabba 

Director 

U.S.  Bureau  of  the  Census 
Overview and
Summary 


Major  Conference  Findings 


Conrad  Taeuber 

Conference  Chairman 
Georgetown  University 


Although  it  was  not  expected  that  the  conference  partici- 
pants would  reach  unanimity  on  the  issues  examined  at 
the  conference,  some  general  directions  can  be  identified 
in  the  discussion.  They  in  no  way  represent  general  agree- 
ment by  the  participants,  however,  and  should  not  be  con- 
sidered to  be  recommendations  from  the  conference. 

1.  There  would  have  been  ready  assent  among  the  con- 
ference participants  to  any  statement  that  emphasizes 
the  importance  of  obtaining  as  nearly  as  possible  a 
complete  count. 

2.  There  appeared  to  be  general  consensus  that  some 
form  of  adjustment  for  the  undercount  is  needed, 
for  areas  with  concentrations  of  persons  likely  to 
be  missed  may  be  receiving  less  than  their  share  of 
funds  which  are  distributed  in  part  or  wholly  on  the 
basis  of  population. 

3.  There  was  lack  of  agreement  on  the  desirability  of 
making  adjustments  to  the  traditional  census  re- 
porting for  apportionment.  Congressman  Garcia  was 
in  favor  of  adjusted  counts  for  all  uses,  while  Senator 
Moynihan  wanted  no  adjustments  made  for  the  ap- 
portionment of  seats  in  the  House.  Judge  McCullum 
(Alameda  County  Superior  Court)  concluded  in  his 
prepared  remarks  that  only  an  "enumeration"  would 
meet  the  constitutional  requirement  regarding  the 
allocation  of  seats  among  the  States.  Within  the 
States,  and  for  State  legislative  purposes,  there  is  no 
such  limitation.  The  States  are  free  to  develop  their 
own  apportionment  and  redistricting  procedures, 
and  even  to  disregard  the  census  figures  or  estimates, 
if  they  believe  they  have  a  superior  set  of  data.  The 
standard  likely  to  be  applied  is  that  the  alternative 
data  must  be  clear,  cogent,  and  convincing.  Equity 
and  equality  as  mandated  by  the  one-man-one-vote 
rule  would  also  apply,  but  the  courts  have  some 
leeway  in  deciding  what  those  standards  require. 

Judge  McCullum  emphasized  that,  except  for  the 
apportionment  of  House  seats  among  the  States,  ad- 
justed figures  would  be  called  for  and  that,  if  the 
Executive  Branch  did  not  supply  such  figures,  the 
courts  were  likely  to  do  so.  In  that  case,  the  results 
might  be  far  less  acceptable  than  if  the  adjustments 
had been made by technical experts. Here Keyfitz's
comment is applicable. He reported that he had discussed
the question of adjusted figures with a judge
who told him that, if confronted with the issue, the
court would call on an expert. Keyfitz had responded
that he might be considered to be such an expert and
that he did not have the answer.

4.  There  was  one  strong  statement  arguing  that  no  ad- 
justment should  be  made.  It  was  felt  that  the  presumed 
greater  accuracy  of  adjusted  counts  would  not  be  crit- 
ical to  business  uses.  In  addition,  the  improvement  in 
accuracy  would  not  offset  the  delays  involved  and  the 
confusion  of  "two  sets  of  books."  It  was  also  argued 
that  adjustment  would  beget  adjustment.  That  is, 
once  adjustments  by  age,  sex,  and  race  had  been 
made,  there  would  be  demands  for  characteristics 
of  the  persons  added  as  a  result  of  the  adjustment 
procedure. 

5.  There  appeared  to  be  general  support  for  the  view 
that  if  an  adjustment  were  to  be  made,  it  should  be 
as  simple  as  possible,  something  that  would  not  only 
be  viewed  as  valid  but  could  be  readily  explained. 
This  would  appear  to  lend  support  to  synthetic  esti- 
mates rather  than  regression  estimates  or  other  more 
sophisticated  techniques  for  States  and  other  sub- 
State  areas. 

6.  There  was  some  uncertainty  concerning  the  timetable 
under  which  any  adjustments  might  be  made.  If  full 
reliance  were  to  be  placed  on  demographic  methods 
of  estimating  the  undercount,  the  results  would  be 
available  earlier  than  if  the  results  of  the  postenumer- 
ation  survey  are  to  be  brought  into  the  computations. 

7.  Users  would  probably  be  willing  to  sacrifice  some 
fine  tuning  of  the  estimates  of  the  undercount  if  that 
would  lead  to  a  more  timely  release  of  the  estimates 
and  of  any  adjustments  that  might  be  made. 

8.  There  was  general  agreement  that  the  decision  to 
adjust  or  not  should  be  made  before  the  census 
results  are  available.  If  the  decision  is  made  to  adjust, 
the  announcement  of  that  decision  should  be  ac- 
companied by  a  statement  of  the  procedure  to  be 
used  in  adjustment. 

9.  There  was  little  discussion  of  the  form  in  which 
adjusted  numbers  should  be  released.  There  seemed 
to  be  agreement  that,  if  issued,  they  should  not  re- 
place the  numbers  published  on  the  basis  of  the 
enumeration,  and  should  not  be  carried  into  cross 
tabulations. It was suggested that this might be accomplished
by issuing "adjustment factors" rather than
adjusted counts, leaving it to individual users to make
use of the adjustment factors as suited their needs.


10.  There  are  special  problems  involved  in  securing  ad- 
justment factors  for  Hispanics  and  other  minority 
groups.  One  suggestion  was  that  the  factor  for  blacks 
should  be  applied  to  Hispanics  and  all  nonwhite 
groups.  Some  Hispanic  spokesmen,  however,  claim 
that  the  undercount  for  their  group  is  greater  than 
that  for  blacks.  The  Bureau  is  on  record  as  speculating 
that  it  falls  somewhere  between  the  factors  for 
whites  and  blacks. 

11. The subject of illegal aliens or undocumented workers
was  discussed  as  a  question  that  needs  to  be  recog- 
nized, though  there  was  no  clear  proposal  by  which 
they  might  be  included  in  estimates  of  the  under- 
count. 

12.  It  was  presumed  that  adjustments,  if  any,  would  con- 
tribute to  equity  in  the  distribution  of  funds  and  any 
other  benefits.  Statistical  methods  for  measuring  the 
improvement  in  equity  were  presented.  One  of  the 
authors  repeatedly  stressed  that  improvements  in 
equity  must  be  measured  in  terms  of  achievement 
of  the  legislative  intent. 

13.  A  review  of  the  statistical  needs  of  Federal  agencies 
led  to  the  conclusion  that  the  underreporting  of  in- 
come in  the  census  was  potentially  a  more  difficult 
issue  than  the  undercount  of  population.  It  is  not 
clear  what  the  impact  of  the  incomes  of  the  un- 
counted persons  would  be  on  distributions  or  on 
measures  of  central  tendency.  The  effect  on  poverty 
measures  is  likewise  uncertain. 

14.  Some  reference  was  made  to  the  variety  of  provisions 
in  the  laws  governing  the  distribution  of  funds  from 
the  Federal  Government.  Some  laws  specify  the  most 
recent  census,  others  speak  of  estimates  by  the  De- 
partment of  Commerce,  and  there  are  a  number  of 
variants  of  these.  Attention  may  need  to  be  paid  to 
the legal actions that may be necessary if adjusted
numbers are to be used. Interest was expressed in
the  experience  of  Australia  where  census  results  are 
published  as  collected,  but  postcensal  estimates  of 
population  take  into  account  an  adjustment  for 
underenumeration. 

15.  There  were  repeated  references  to  the  difference  be- 
tween "imputations"  and  "adjustments."  It  was 
pointed  out  that  the  proposed  adjustments  would 
not  be  significantly  different  from  the  procedures 
used  for  1970  when  additions  were  made  to  the 
enumerated  population.  The  postenumeration  post 
office  check  and  the  vacancy  check  in  connection 
with  the  1970  census  were  viewed  as  on  the  thin 
edge.  The  imputations  based  on  these  two  postenu- 
meration checks  were  distributed  at  random  and  the 
characteristics  of  the  imputed  individuals  were  de- 
rived by  statistical  procedures.  The  imputations  made 
on  the  basis  of  the  checks  after  the  enumeration  are 
not  likely  to  recur  in  1980  because  the  mail-out  mail- 
back  procedure  is  nearly  universal,  as  is  the  program 
to  visit  each  unit  that  is  initially  reported  as  vacant. 

16.  There  was  a  call  for  more  and  intensive  research  into 
the  means  of  reducing  the  undercount  as  well  as  into 
appropriate  methods  for  making  adjustments.  Tech- 
niques of  matching  offer  possibilities  that  are  only 
partially  realized  due  to  the  primitive  state  of  the 
methodology.  It  was  suggested  that  far  too  little  use 
has  been  made  of  the  opportunities  created  through 
the  availability  of  administrative  records. 

17.  There  was  a  plea  that  the  data  from  any  postenumer- 
ation analysis  be  made  available  promptly  to  research 
workers  outside  the  Bureau  of  the  Census  for  inde- 
pendent analyses. 

18. Attention was called to the likelihood that the undercount
would lead to a dilution of the strength of
liberal  and  big  city  representatives  in  the  House. 
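
Findings 5, 9, and 10 refer to synthetic estimates and to published "adjustment factors" that users would apply themselves. The following is only a minimal sketch of that idea, assuming that a single national undercount rate per demographic group is applied uniformly to each local area's enumerated counts. The group names, the rates, and the function names are hypothetical illustrations; they are not Bureau estimates or procedures.

```python
# Hypothetical national undercount rates by group (illustrative values only,
# not estimates from the conference or the Bureau).
UNDERCOUNT_RATE = {"group_a": 0.02, "group_b": 0.08}


def adjustment_factor(rate):
    """Return the factor that scales an enumerated count to an estimated
    true count, given the share of the true population that was missed."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("undercount rate must be in [0, 1)")
    return 1.0 / (1.0 - rate)


def synthetic_adjusted_total(enumerated_by_group, rates=UNDERCOUNT_RATE):
    """Apply each group's national factor to a local area's enumerated
    counts and sum the results (the synthetic assumption: the national
    rate for a group holds in every area)."""
    return sum(
        count * adjustment_factor(rates[group])
        for group, count in enumerated_by_group.items()
    )


# A local area with 980 enumerated persons in group_a and 920 in group_b.
area = {"group_a": 980, "group_b": 920}
adjusted = synthetic_adjusted_total(area)
```

Publishing only the factors, as suggested in finding 9, would leave the enumerated counts untouched and let each user decide whether to apply a calculation of this kind.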


The  Bureau's  Agenda  on  the 
Undercount  Decision 


Vincent  P.  Barabba 

Director 

Bureau  of  the  Census 


How  do  we  plan  to  use  the  comments  from  this  confer- 
ence in  our  decision  process? 

First,  the  process  began  with  the  publication  of  the  1978 
report  of  the  Panel  on  Decennial  Census  Plans  of  the  National 
Academy  of  Sciences  indicating  the  importance  of  census 
undercounts.  Second,  the  Census  Bureau  followed  this 
activity  with  a  workshop  in  September  1979,  which  at- 
tempted to  surface  the  critical  assumptions  related  to  the 
various  adjustment  procedures  discussed  by  the  Panel.  And 
third,  we  have  this  conference,  which  has  attempted  to 
bring  together  the  different  perspectives  of  various  re- 
searchers and  interest  groups.  We  have  heard  discussions  on 
statistical  methods.  We  have  heard  economic  cost/benefit 
analysis.  Social  and  political  considerations  have  been  brought 
up.  And  we  have  heard  emphasis  on  the  need  to  educate  the 
public  and  all  of  its  subsegments,  including  the  Congress. 

We  have  not  attempted  at  this  conference  to  relate  the 
issues  of  these  various  perspectives  to  each  other.  We  intend, 
however,  to  bring  together  this  synthesis  of  issues,  and  here 
is  how  we  plan  to  do  it. 

1.  Within  the  next  2  months  we  will  assimilate  and  report 
on  this  conference  and  obtain  further  comments  on 
the  views  that  were  presented. 

2. Based on the conference and other inputs, a series of
working papers will be prepared for general distribution
to highlight and clarify the major elements of
interest and concern on the adjustment issue.

3.  In  September  1980,  we  will  conduct  a  workshop  to 
synthesize  findings  and  discuss  possible  recommenda- 
tions. At  this  workshop,  we  will  deal  with  the  inte- 
grating issues,  some  of  which  are  as  follows: 

•  Do  the  benefits  of  adjusting  for  undercount  out- 
weigh the  cost? 

•  How  do  various  interest  groups  perceive  the  benefits? 

•  What  will  the  law  allow? 

•  What  will  the  political  system  allow? 

•  Can  we  combine  the  greatest  benefits  of  some 
techniques  with  the  greatest  benefits  of  others?  In 
essence,  can  we  develop  a  win/win  situation? 

4.  Following  the  September  workshop,  we  will  officially 
publish  a  document  of  our  findings,  making  explicit 
the  critical  assumptions  that  will  underlie  our  final 
decision. 

5. In a November/December 1980 time frame, we will
develop our final decision on whether, and if so,
when and how to adjust for undercount, based on all
available  information,  including  any  preliminary  assess- 
ments as  to  probable  undercount  rates  for  1980.  The 
information  obtained  through  this  conference  will 
have  had  a  very  definite  role  in  shaping  the  decision. 


Background 


Welcome  and  Introduction 


Vincent  P.  Barabba 

Director 
Bureau  of  the  Census 


I  am  very  pleased  to  welcome  all  of  you  to  this  confer- 
ence. I  also  wish  to  express  appreciation  in  advance  to  all  of 
those  who  helped  prepare  for  this  event,  including  the  par- 
ticipants who  have  prepared  papers  which  will  give  us  much 
to  discuss  in  the  next  2  days. 

Less  than  5  weeks  from  today,  a  census  questionnaire  will 
be  delivered  to  virtually  every  household  in  the  United  States. 
After  that,  our  success  will  depend  to  a  great  extent  on  the 
American  people,  and  the  thousands  of  individuals,  cities 
and  towns,  and  organizations  that  are  making  extraordinary 
efforts  to  encourage  cooperation  and  to  achieve  complete 
coverage. 

Nonetheless,  I  doubt  that  we  would  be  here  today  if  we 
were  sure  that  we  would  achieve  100-percent  success.  Some 
degree  of  census  undercount  is  a  sure  companion  to  a  free 
and  mobile  society.  So  we  will  be  focusing  on  what  to  do 
about  census  undercount,  and  whether,  when,  and  how  to 
adjust  for  it. 

The  next  point  is  that  I  doubt  if  we  would  be  here  if  it 
were  February  of  1970  instead  of  1980.  Before  the  last 
census,  concern  about  undercount  was  largely  a  quiet  aca- 
demic exercise.  In  fact,  one  of  our  two  distinguished  con- 
gressional guests  on  the  program  today,  Senator  Moynihan, 
organized  a  conference  13  years  ago  on  very  similar  topics, 
and  the  level  of  interest  was  considerably  lower  than  now. 

The  world  has  changed  very  much  in  a  single  decade. 
There  were,  of  course,  complaints  about  undercount  in  1970 
from  many  communities  after  the  census.  But  it  was  really 
the  dramatic  social  changes  from  the  midsixties  to  the 
midseventies  that  focused  national  attention  on  the  benefits 
of  a  population  census. 

There  were  two  major  developments  in  particular.  First 
there  was  the  civil  rights  movement,  together  with  landmark 
court  cases  concerning  apportionment  and  redistricting.  As 
members  of  minority  groups  gained  access  to  the  political 
process,  the  use  of  census  data  to  determine  equitable 
representation  at  all  levels  of  government  became  an  instru- 
ment of  progress,  and  concern  for  the  adequacy  of  the 
counts  increased  sharply. 

The  other  major  development  was  the  host  of  statutes  in 
this  period  that  moved  Federal  resources  back  to  State  and 
local  governments  to  address  some  of  the  problems  of  edu- 
cation, housing,  and  other  social  needs.  It  is  now  estimated 
that  some  $50  billion  is  distributed  annually  by  law  directly 
or  indirectly  on  the  basis  of  census  data.  These  two  devel- 
opments have  placed  the  question  of  equity  squarely  before 
us, and a key issue is whether, and how, adjustments for
missing data would produce more equity in the allocation
of shares of resources.

Before  I  get  into  the  issues,  let  me  simply  point  out  that 
the  most  important  issue  is  for  the  Census  Bureau  to  make 
every  effort  to  achieve  the  most  complete  count  possible. 
The  Bureau  has  taken  a  number  of  major  steps  in  this  census 
to  achieve  that  goal,  and  we  are  spending  upwards  of  $200 
million  on  better  coverage.  These  are  either  new  steps  or 
vastly  improved  procedures  over  the  last  census.  Among 
others,  they  include  the  following  elements:  We  have  de- 
veloped lists  of  addresses  all  across  the  country.  We  will 
have  several  on-the-ground  checks  of  all  addresses  in  cooper- 
ation with  the  Postal  Service  before  and  during  the  census. 
We  will  have  new  procedures  to  improve  coverage  within  the 
housing  units.  We  have  a  new  program  where  we  have  invited 
the  highest  elected  officials  in  every  city  and  county  to  re- 
view the  preliminary  counts  and  provide  feedback  to  the 
Bureau. 

Just  as  important  as  these  procedural  steps,  we  are  work- 
ing as  never  before  to  convince  people  to  cooperate  with 
the  census.  This  outreach  effort  includes  widespread  adver- 
tising and  public  information  programs  with  the  media, 
private  industry,  public  utilities,  and  other  institutions.  It 
also  includes  a  grass  roots  program  where  more  than  200 
full-time  Census  Bureau  workers  are  explaining  the  census 
to  minority  populations  in  their  communities  and  encourag- 
ing local  organizations  to  do  the  same. 

And  finally,  we  have  for  5  years  been  working  with 
Census  Advisory  Committees  that  represent  the  black, 
Hispanic,  and  Asian  and  Pacific  Island  populations  to  achieve 
the  best  possible  count. 

I  am,  at  the  present  time,  in  my  fourth  career  as  a  person 
deeply  involved  in  the  development  and  use  of  statistics.  Let 
me identify briefly how my experiences in these careers have
affected  the  way  that  I  approach  the  issue  of  faulty  and 
missing  data. 

In my first career, I provided information to decisionmakers
in political campaigns. Because of deadlines
(election days are never postponed) and very limited
resources, we seldom had enough time to deal with
faulty or missing data in a meaningful way.

My  second  career  was  at  the  Census  Bureau  at  a  time 
when  the  evaluation  of  the  1970  census  was  being  com- 
pleted and  when  the  1980  census  was  being  planned.  Given 
the  interest,  and  the  availability  of  resources,  we  were  able 
to devote the time and effort necessary to achieve considerable
success in identifying missing and faulty data. As a
result, the strengths and the weaknesses of the 1970 census
data  were  made  known  to  many,  and  I  believe  that  in  doing 
so,  we  greatly  improved  their  utility  to  the  users. 

My  third  career  was  in  a  large  private  corporation  where 
we  faced  marketing  campaigns  that  involved  time  pressures 
similar  to  those  I  experienced  in  political  campaigns.  How- 
ever, because  of  our  size,  the  costs  associated  with  our  de- 
cisions were  of  such  magnitude  that  improving  the  data  was 
absolutely  essential.  We  simply  had  to  find  the  time.  The 
resources,  because  of  our  size,  were  available. 

In  one  of  our  market  surveys,  one  focal  point  for  analysis 
was  a  unit  of  measurement  we  labeled  "the  establishment." 
We found it necessary to assign characteristics for unreported
data (in this case, missing data items within the
establishment) in order to produce a sound analysis. It
turned  out  that  involving  eventual  users  of  the  data  in 
handling  these  imputations  produced  some  surprising  and 
pleasant  results. 

First,  we  held  discussions  to  establish  which  variables— 
or  data  elements— would  require  imputation.  The  very  fact 
that  we  held  these  discussions  significantly  reduced  the 
number  of  imputed  variables  and  also  focused  the  users' 
thinking  about  the  quality  of  the  data  items.  This  user  con- 
tact helped  us  establish  realistic  expectations  concerning 
the  data  well  in  advance  of  the  results.  The  frankness  of 
the  discussions  also  reduced  the  extent  to  which  the  user 
community  perceived  the  process  of  adjusting  the  data  to 
be  "juggling"  the  figures,  or  "massaging"  or  "fudging"  the 
data. 

Another  advantage  to  holding  the  discussions  was  that  we 
were  able  to  agree  on  the  criteria  to  use  for  making  imputa- 
tions, bringing  to  bear  legitimate  opinions  and  settling  dis- 
putes among  users  prior  to  the  imputation  process.  Further- 
more, the  discussions  highlighted  which  pieces  of  data  were 
really  important  from  a  user  point  of  view.  The  overall  result 
of  the  discussions  was  an  improvement  of  the  imputation 
process  and  at  the  same  time,  a  reduction  in  anxiety  on  the 
part  of  the  users. 

Finally,  using  standard  operating  procedures  developed 
at  the  Census  Bureau,  we  gave  the  users  the  percent  of  im- 
puted data  for  each  item,  which  showed  them  that  nothing 
was  being  hidden  and  also  gave  them  greater  insight  for  use 
in  interpreting  and  using  the  results  of  the  survey. 

I  offer  this  experience  from  my  third  career  because  it 
relates  to  one  of  the  major  issues  I  now  face  in  my  fourth 
career  as  we  attempt  to  conduct  the  1980  census.  This  time, 
I  have  the  responsibility  of  directing  the  implementation  of 
the  census  plan  that  was  begun  during  my  second  career, 
and  one  of  the  major  concerns  in  doing  this  job  is  dealing 
with  any  missing  or  faulty  data.  There  are  a  number  of  as- 
pects when  it  comes  to  missing  or  faulty  data  in  a  census, 
but  I'd  like  to  concentrate  on  the  aspect  of  most  concern 
to  this  conference,  and  that  is  the  problem  of  undercount, 
or  more  specifically,  accounting  for  persons  missing  from 
the  count. 


In  tackling  this  issue  before  we  are  actually  faced  with 
the  new  numbers  next  year,  I  hope  to  follow  the  lead  from 
my  experience  both  in  the  public  sector  and  private  in- 
dustry—that is,  to  get  input  from  involved  parties  prior  to 
any  decision.  This  meeting  is  an  example.  In  the  end,  we 
hope  to  openly  distribute  a  plan  for  making  the  ultimate 
decision  and  a  plan  for  the  implementation  of  adjustment, 
if  the  decision  is  to  adjust,  prior  to  any  actual  decision. 

It  is  my  impression  that  too  often  we  either  do  not  learn 
from  or  use  the  results  of  the  various  commissions  or  con- 
ferences within  which  we  participate.  Because  of  this  con- 
cern, the  Bureau  has  made  a  conscientious  effort  to  thor- 
oughly investigate  the  previous  activities  that  have  been 
directed  toward  this  problem.  This  investigation  has  pro- 
vided us  with  significant  insight  and  direction. 

However,  in  this  investigation,  we  have  found  wanting, 
in  some  instances,  a  challenging  type  of  review  for  some  of 
the  assumptions  which  underlie  the  many  conclusions  that 
have  been  brought  forward  on  this  issue.  For  example,  in  the 
report  of  the  Panel  of  the  National  Academy  of  Sciences, 
the  Panel  concluded  that  some  kind  of  adjustment  for 
undercount  is  feasible  and  that  the  technical  responsibility 
for  procedures  should  rest  with  the  Bureau  of  the  Census. 
The  Panel  also  recommended  that  we  state  publicly,  before 
the  census  date,  the  general  methods  we  would  follow  if 
adjustments  are  to  be  made.  (I  assume,  of  course,  it  will 
not  come  as  a  surprise  to  anyone  in  this  audience  that  I 
do  not  have  that  statement  ready  today.) 

There  are  at  least  two  assumptions  that  underlie  the 
conclusions  of  the  Panel  that  I  would  like  to  use  in  defining 
what  I  mean  by  undergoing  a  challenging  review.  First,  the 
idea  of  stating  a  convention  in  advance,  of  course,  has 
obvious  attractions.  Ideally,  if  everyone  agreed  in  advance 
how  the  numbers  would  be  adjusted  before  seeing  them, 
then  there  would  be  less  reason  to  contest  the  results.  That 
assumes,  of  course,  that  whether  you  are  a  "winner"  or  a 
"loser"  (as  the  result  of  any  adjustment  process,  the  "ac- 
curacy" of  which  can  be  legitimately  questioned),  you  will 
accept  (without  argument)  the  results  of  a  predetermined 
adjustment  process. 

Second,  the  Panel  also  observed  that  making  adjustments 
for  missed  people  could  be  seen  in  principle  as  an  extension 
of  other  techniques  the  Bureau  has  previously  applied  to 
correct  deficiencies  both  in  the  counts  and  reported  charac- 
teristics. However,  the  assumption  of  precedent  having  been 
set  has  been  debated  at  length  within  the  Bureau,  and  it  is 
not  at  all  clear  that  all  of  the  1970  procedures  are  simple 
extensions  of  the  same  principle.  Some  corrections,  for  ex- 
ample, were  simply  the  result  of  correcting  faulty  geography, 
replacing  lost  materials,  or  using  convincing  second-hand 
evidence  of  people  existing  in  a  specified  place— even  though 
we  did  not  interview  them  directly. 

The  nearest  thing  to  an  adjustment  procedure  of  the  un- 
counted was  the  national  vacancy  check,  in  which  classes  of 
people  were  added  to  the  actual  counts  on  the  basis  of
revisits.  But  this  applied  only  to  a  sample  of  housing  units
originally  classified  as  vacant.  One  school  of  thought  is  that 
even  this  limited  procedure,  which  was  not  designed  in 
advance  but  developed  during  the  enumeration  to  correct 
defects,  went  beyond  the  narrow  technical  meaning  of  a 
census  enumeration. 

Indeed,  the  Census  Bureau  and  the  Congress  were  suffi- 
ciently concerned  about  adding  people  in  1970  based  on  a 
sample  check  of  vacant  units,  to  develop  1980  plans  costing 
many  millions  of  dollars  to  go  back,  in  1980,  to  all  units 
originally  classified  as  vacant.  Thus,  people  would  be  added 
based  on  complete  field  checks  rather  than,  as  in  1970,  on 
the  basis  of  a  probability  that  a  proportion  of  vacant  units 
were  really  occupied. 

Again,  I  use  these  two  assumptions  not  because  I  disagree 
with  the  conclusion  of  the  National  Academy  Panel  that  it  is 
feasible  to  adjust.  I  mention  them  because  I  want  to  build 
on  the  work  of  that  distinguished  Panel.  We  want  to  accept 
that  which  stands  the  test  of  challenge  from  multiple  per- 
spectives. We  will  then  either  reject  or  modify  that  which 
fails  to  stand  on  its  own.  Of  course,  I  do  not  mean  a  statis- 
tical test.  I  mean  the  test  of  acceptability  by  those  who  will 
be  affected  by  the  decision. 

Let  me  continue  developing  an  operational  definition  of  a 
challenging  test  by  using  the  same  two  assumptions.  Re- 
member now,  the  two  assumptions  are:  First,  those  to  be 
affected  by  an  adjustment  of  the  enumeration  will  agree  on 
an  adjustment  procedure  prior  to  an  assessment  of  whether 
they  will  be  impacted  positively  or  negatively,  and  second, 
the  Bureau's  actions  of  the  past  will  be  acceptable  precedent 
to  justify  current  actions. 

If  you  accept  those  assumptions,  it  seems  to  me  that  you 
will  also  accept  the  following:  A  very  senior  congressional 
delegation  from  State  A  will  accept  without  contest  the 
apportionment  of  the  435th  member  of  the  House  of  Representatives
to  State  B  rather  than  their  own  State  on  the
basis  of  fewer  than  300  people.  They  will  do  so  even  when
the  final  count  for  the  apportionment  was  adjusted  by  the 
Census  Bureau,  using  a  projection  from  a  revisit  to  a  national 
sample  of  vacant  dwelling  units  so  that  State  B  had  300 
more  people  added  to  its  "count"  than  did  State  A. 

I  bring  this  up  not  to  cast  aspersions  on  previous  adjust- 
ments, nor  do  I  even  assume  that  the  scenario  I  just  described 
could  not  be  accepted.  I  could,  of  course,  raise  similar 
challenges  to  the  assumptions  required  before  the  conclusion 
"not  to  adjust"  could  be  accepted.  The  central  point  is  that  all 
of  the  critical  assumptions  must  be  tested  before  the  Bureau 
makes  its  recommendation. 

I  say  this  at  the  beginning  of  this  conference  because
you,  as  a  group,  have  been  selected  for  your  specific
knowledge  and  expertise  and  for  the  many  backgrounds
and  interests  that  you  represent.  We  want
to  hear  not  only  your  ideas,  but  how  you  react  to  the  ideas 
of  others  and  how  they  react  to  yours.  We  will  listen  and  we 
will  listen  carefully,  because  this  conference  is  a  very  im- 
portant element  of  the  process  by  which  we  will  ultimately 
make  our  recommendation. 

Whatever  the  final  decisions  may  be,  our  overall  objective 
is  to  ensure  that  they  are  well-informed,  well-understood,
and  open.  Because  the  decisions  are  of  such  basic  impor- 
tance, and  the  issues  are  complex,  we  are  going  to  take  a 
bit  more  time  getting  to  them  than  some  would  prefer. 
For  that  reason,  I  ask  for  your  patience  as  well  as  your 
continuing  attention  to  the  process  we  go  through  to  get 
there. 

I  am  strongly  committed  to  keeping  everyone  informed 
on  the  process.  I  am  also  strongly  committed  to  do  what  is 
right  by  the  American  people.  I  hope  that  by  keeping  our 
thought  process  open  to  you,  you  will  accept  the  honesty 
and  sincerity  of  our  effort— even  if  you  don't  like  our 
eventual  conclusion.  After  all,  I  wouldn't  want  my  fifth 
career  to  start  prematurely. 


Census  Undercount:  Time  To  Adjust 


Robert  Garcia 

U.S.  House  of  Representatives 


My  interest  in  the  census  undercount  preceded  my  tenure 
as  the  Chairman  of  the  House  Census  Subcommittee.  As  a 
State  senator  and  the  minority  leader  of  the  New  York  State 
Senate,  I  know  what  redistricting  is  all  about.  I  became  dis- 
tinctly aware  of  the  difficulties  caused  by  the  census  under- 
count. Even  before  then,  I  had  been  aware  of  the  inequities 
resulting  from  the  fact  that  I  came  from  an  area  with  a  large 
number  of  people  who  are  never  recorded  in  the  census.  This 
special  perspective  was  a  main  reason  why  I  sought  a  place  on 
the  House  Census  Subcommittee  and  plan  to  remain  as  its 
chairman  for  as  long  as  the  people  of  the  Bronx  continue  to 
send  me  back  to  Congress. 

I  represent  a  district  with  a  large  black  and  Puerto  Rican 
population— the  very  kind  of  people  who  the  Census  Bureau 
says  are  the  most  frequently  missed. 

Since  becoming  chairman  of  the  House  Census  Subcom- 
mittee, I  have  been  very  much  impressed  to  learn  about  the 
research  the  Bureau  of  the  Census  has  conducted  regarding 
census  errors.  But,  at  the  beginning  of  this  conference,  let  us 
recognize  that  disputes  about  the  accuracy  of  census  results 
occurred  long  before  the  research  began. 

In  fact,  almost  every  census  since  1790  involved  an  undercount
controversy.  There  were  heated  congressional  debates
on  the  count  during  the  19th  century.  For  example,  census 
results  were  challenged  by  States  during  the  1830's,  the 
1850's,  and  1860's.  In  1840,  the  American  Statistical  Associ- 
ation issued  a  report  criticizing  the  accuracy  of  census  results. 
After  the  census  of  1870,  Nebraska  decided  to  elect  an  extra 
Member  to  Congress  because  they  thought  that  the  State  had 
been  undercounted.  Congress  considered  their  claim  on  its 
merits,  but  finally  declined  to  seat  the  extra  Member.  On 
three  previous  occasions,  Congress  decided  to  allow  States
an  extra  seat  because  of  errors  in  the  census.  In  1890,  New 
York  City  demanded  a  recount.  In  1920,  with  each  side 
claiming  that  they  had  been  undercounted,  the  Congress  was 
unable  to  decide  upon  a  bill  to  implement  the  reappor- 
tionment. Disputes  about  census  accuracy  are  certainly 
not  new. 

In  two  respects,  we  face  a  totally  different  situation  today. 
First,  after  the  problems  created  by  the  census  of  1920,  the 
Congress  enacted  legislation  giving  the  Census  Bureau  much 
greater  latitude  in  conducting  the  census  than  it  ever  had 
before.  The  permanent  Census  Bureau,  which  was  estab- 
lished in  1902,  was  now  encouraged  to  increase  the  pro- 
fessional quality  of  its  staff,  which  assumed  a  greater  role  in 
deciding  upon  the  subjects  covered,  the  rates  of  pay,  and 
the  methods  of  census  enumeration.  We  are  very  fortunate
that  this  occurred  at  a  time  of  great  advances  in  the  sciences
of  statistics  and  demography.  The  application  of  sampling 
to  the  work  of  the  Census  Bureau,  and  the  development  of  a 
more  refined  notion  of  the  idea  of  census  and  survey  error 
have  laid  the  groundwork  for  the  adjustments  you  will  be
discussing  over  the  next  2  days. 

Because  there  has  been  some  talk  that  the  undercount 
problem  was  in  part  created  by  the  increased  awareness 
resulting  from  these  studies,  I  want  you  to  know  that  it  is 
my  view  that  these  studies  (especially  the  work  of  Jay 
Siegel)  have  not  decreased  but  rather  increased  the  credi- 
bility of  census  results.  Before  the  studies  were  conducted, 
the  debates  about  the  undercount  were  cast  in  vague  terms 
which  made  the  issue  intractable.  Now,  we  can  base  our 
consideration  of  the  issue  upon  solid  scientific  work— work 
which  is  notable  for  its  self-conscious  attention  to  its  own 
limitations.  This  kind  of  careful  and  professional  approach 
has  increased  the  confidence  the  Congress  has  in  the  Census 
Bureau.  I  am  here  to  urge  you  to  continue  in  that  tradition. 

The  second  respect  in  which  our  situation  differs  from 
that  facing  the  censuses  before  1930  is  that  because  of  all 
the  advances  that  have  been  made  since  then,  we  have  come 
to  rely  upon  census  results  in  our  policy  decisions,  for  plan- 
ning, and,  most  importantly,  in  the  distribution  of  Federal 
benefits.  Even  where  Federal  grants  are  discretionary,  popu- 
lation and  characteristics  information  drawn  from  the  census 
and  the  current  estimates  based  on  the  census  are  an  impor- 
tant  consideration  in  the  decisionmaking  process. 

This  increased  use  of  census  results  has  heightened  our 
concern  that  the  procedures  used  by  the  Bureau  of  the 
Census  should  not  only  aim  to  achieve  the  greatest  amount 
of  overall  accuracy— they  should  go  beyond  this  to  ensure 
that  these  efforts  result  in  an  enumeration  that  is  also 
equitable.  Equity  is  only  achieved  when  the  resources  and 
skills  of  the  professionals  at  the  Census  Bureau  are  used  in 
such  a  way  as  to  be  sure  that  we  do  not  overlook  the  im- 
balances in  the  undercount. 

Frankly,  if  the  2.5-percent  undercount  which  the  Bureau 
estimates  occurred  in  1970  were  evenly  distributed  among 
all  the  places  in  the  Nation  and  among  all  the  different 
groups  of  people— rich  and  poor,  old  and  young,  men  and 
women,  black,  white,  and  Hispanic— and  if  this  undercount 
were  not  so  severely  concentrated  among  the  very  people 
Congress  intended  to  aid  the  most,  it  would  be  of  much 
less  concern.  But  the  fact  is  that  in  1970  the  Bureau  reports 
it  counted  97.5  percent  of  the  population,  but  only  92.3 
percent  of  the  blacks,  and  only  81.5  percent  of  black  men 
aged  25  to  34  years.  Equity  demands  that  we  address
ourselves  to  this  disparity.

I  cannot  let  this  point  pass  without  acknowledging  the 
improvements  that  the  Bureau  has  introduced  into  census- 
taking  procedures.  Many  of  these  improvements  are  aimed 
at  reducing  the  undercount  of  minorities.  Yet  there  are 
limits  to  the  improvements  that  these  efforts  can  achieve. 
During  the  last  year,  our  subcommittee  has  heard  from  many 
technically  trained  witnesses.  One  of  the  most  impressive  was 
Professor  Philip  Hauser.  Professor  Hauser— as  I  am  sure  you 
all  know— was  associated  with  the  Census  Bureau  during 
every  census  since  1930. 

He  was  largely  responsible,  during  his  tenure  as  deputy 
director  and  director,  for  evaluation  procedures  as  an  integral 
part  of  the  decennial  census.  Coming  before  us  in  his  home- 
town of  Chicago,  Professor  Hauser  urged  the  Census  Bureau 
to  use  all  of  the  information  they  have  to  adjust  the  census 
figures  and  redress  the  imbalance  created  by  the  differential 
undercount.  I  am  extremely  impressed  by  his  logic. 

Prior  to  the  census  of  1930,  the  Congress  used  to  act  as 
the  final  arbiter  of  census  accuracy.  In  1929,  it  passed  the 
law  assigning  the  Bureau  the  main  responsibility  for  designing 
census  procedures  and  deciding  upon  the  topics  to  be 
covered  in  the  census.  By  this  act,  the  Congress  made  the 
Census  Bureau  responsible  for  using  its  resources  to  ensure 
that  the  census  results  reflected  all  the  information  they 
had  about  the  size  and  characteristics  of  the  population.  If 
evaluation  studies  show  that  there  are  undercounts  that  fall 
unevenly  on  the  population,  the  Census  Bureau  has  a  duty 
to  devise  procedures— using  the  best  professional  talent  avail- 
able—to make  the  most  use  of  this  information.  This  con- 
gressional power  is  delegated  to  the  Census  Bureau  with  a 
mandate  to  use  the  most  reliable  procedures  available.  Under 
these  circumstances,  I  am  convinced  that  adjusting  the  census 
is  appropriate.  Exactly  how  this  is  done  is  a  matter  for  the 
experts.  That  is  why  you  have  been  called  to  this  conference. 

Including  information  in  the  census  that  was  not  directly 
obtained  from  the  residents  of  a  household  is  certainly  not
new.  Processing  errors  have  been  corrected  in  this
way  for  several  decades.  During  the  1970  census,  more  than  a 
million  people  were  added  to  the  census  counts  as  a  result  of 
the  vacancy  recheck  program.  This  adjustment— that  is  the 
only  way  to  refer  to  it— was  based  on  a  sample  survey  of 
vacant  units.  Factors  were  derived  from  this  survey,  and 
whole  households  (together  with  the  people  living  in  them 
and  their  characteristics  inferred  from  the  survey)  were  added 
to  the  census  counts.  These  data  were  used  in  the  apportion- 
ment. Furthermore,  other  adjustments  were  made  in  1970. 

The  Bureau  exercised  its  judgment  and  adjusted  the 
census  figures  to  correct  for  the  fact  that  unprecedented 
numbers  of  Americans  were  overseas  due  to  our  involvement 
in  the  Vietnam  war.  According  to  Census  Bureau  testimony 
presented  before  our  subcommittee  in  1976,  this  correction 
had  the  effect  of  awarding  an  extra  seat  to  Oklahoma  at  the 
expense  of  Connecticut.  The  decision  to  include  overseas
citizens  of  the  United  States  in  the  count  of  the  States  where
they  had  ties  was  unprecedented  in  the  annals  of  U.S.  census 
taking.  It  involved  complicated  statistical  procedures.  Ac- 
cording to  the  Bureau  analysis  of  the  procedure,  "the  data 
on  'home  State'  of  those  overseas  are  of  an  unknown  reli- 
ability." The  Census  Bureau  has  announced  that  this  pro- 
cedure will  not  be  used  in  1980.  The  adjustments  being 
discussed  for  the  1980  census  would  be  much  more  reliable 
than  this  adjustment,  which  had  an  impact  on  the  reappor- 
tionment. 

Adjustment  based  on  a  more  comprehensive  methodology 
would  be  no  different  logically  from  the  1970  imputations. 
The  distinction  would  be  that  in  this  case  more  compre- 
hensive information  would  be  used.  The  papers  you  will 
consider  in  the  next  2  days  illustrate  that  adjustment  can  be 
accomplished  through  several  procedures  and  the  impact 
of  each  will  be  different.  Consequently,  it  is  important  to 
arrive  at  a  consensus  as  to  the  most  equitable  and  accurate 
procedure. 

In  spite  of  the  best  efforts  of  the  researchers  working  at 
the  Bureau  of  the  Census  and  the  best  advice  you  can  give 
them,  it  seems  that  the  procedures  available  for  correcting 
the  census  count  are  all  subject  to  limitations  that  arise 
from  the  assumptions  that  must  be  made  before  they  can  be 
implemented.  The  choice  of  a  procedure  depends  upon  the 
kinds  of  assumptions  the  adjuster  is  willing  to  accept.  The 
results  of  the  correction  may  differ  in  important  ways, 
depending  upon  the  procedure  that  is  used.  It  is  not  possible 
to  know  the  direction  or  amount  of  these  differences  until 
the  census  is  completed  and  the  corrections  are  made.  Never- 
theless, I  believe  we  ought  to  agree  to  use  the  data.  For 
example,  several  methods  that  might  be  used  to  adjust  the 
results  of  the  1980  census  suffer  from  the  problem  that 
they  rely  upon  the  assumption  that  the  errors  of  coverage 
present  are  statistically  independent  of  each  other.  In  other 
words,  to  accept  the  results  of  each  of  these  analyses,  we 
must  assume  that  it  is  not  likely  that  persons  missed  in 
one  method  (the  census)  would  be  missed  in  another  (e.g., 
a  postenumeration  survey).  In  fact,  this  assumption  is  di- 
rectly counter  to  the  trends  that  have  been  found.  Persons 
missed  in  the  census  are  more  likely  to  be  missed  in  a  post- 
enumeration  survey,  to  be  left  out  of  administrative  records, 
and  to  be  excluded  from  vital  statistics. 
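
The independence assumption described above can be made concrete with a small worked sketch. It uses the standard dual-system (capture-recapture) estimator, in which the population is estimated as the census count times the survey count, divided by the number of persons found in both. The function names, group sizes, and capture probabilities below are hypothetical and chosen only for illustration; they are not Census Bureau figures or procedures.

```python
def dual_system_estimate(in_census, in_survey, in_both):
    """Estimate total population from two counts and their overlap.

    Valid only if being missed by one source is independent of
    being missed by the other.
    """
    if in_both <= 0:
        raise ValueError("at least one matched person is required")
    return in_census * in_survey / in_both


def expected_counts(groups):
    """Expected (census, survey, both) counts for a list of groups.

    Each group is (size, p_census, p_survey); within a group the two
    sources are treated as independent. All values are hypothetical.
    """
    census = sum(n * pc for n, pc, ps in groups)
    survey = sum(n * ps for n, pc, ps in groups)
    both = sum(n * pc * ps for n, pc, ps in groups)
    return census, survey, both


# Case 1: everyone has the same 90 percent chance of being counted.
homogeneous = [(1000, 0.90, 0.90)]

# Case 2: same true total, but a hard-to-count group of 200 persons
# is likely to be missed by BOTH the census and the survey.
heterogeneous = [(800, 0.98, 0.98), (200, 0.58, 0.58)]

estimate_homogeneous = dual_system_estimate(*expected_counts(homogeneous))
estimate_heterogeneous = dual_system_estimate(*expected_counts(heterogeneous))

print(round(estimate_homogeneous, 1))
print(round(estimate_heterogeneous, 1))
```

Under these assumed numbers both cases count 900 persons in each source, yet the estimator recovers the full 1,000 only in the first case. In the second case it falls short, at about 969, because persons missed by the census also tend to be missed by the survey. That is the direction of bias the passage describes.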

I  want  to  assure  you  that  the  Congress  will  exercise  very 
careful  oversight  of  the  Census  Bureau's  decision  to  adjust 
so  as  to  be  certain  that  this  decision  results  in  the  most 
equitable  procedure  available.  I  trust  that,  as  usual,  we  will 
be  kept  informed  of  the  likely  alternatives  and  decisions 
that  are  made  so  that  we  will  have  adequate  opportunity  to 
comment. 

Because  all  of  these  issues  raise  difficult  problems,  I  was 
very  pleased  to  review  the  list  of  speakers  at  this  conference. 
For  example,  you  have  invited  leaders  of  census  taking  in 
Canada  and  Australia,  which  have  both  used  census  adjustment
procedures.  Several  of  the  papers  you  will  hear  will
present  the  results  of  technical  work  that  can  be  used  in
preparing  the  most  appropriate  adjustments.  On  behalf  of 
the  House  of  Representatives,  I  would  like  to  express  our 
appreciation  for  all  of  the  work  that  has  gone  into  preparing 
these  studies.  We  recognize  that  there  may  be  no  single, 
simple,  and  completely  adequate  solution  to  the  problem 
you  will  consider.  But  I  would  like  to  urge  you  to  arrive  at 
a  consensus  as  to  the  best  that  can  be  done  and  to  implement 
that  alternative  for  the  benefit  of  the  Nation.  I,  for  one, 
look  forward  with  great  anticipation  to  the  results  of  this 
meeting. 

There  is  one  more  point  that  I  would  like  to  address  to 
the  employees  of  the  Census  Bureau,  since  I  have  so  many  of 
them  here  together  in  the  same  room.  I  have  always  appreci- 
ated and  supported  your  efforts  at  improving  the  census 
procedures.  These  are  extensive  efforts,  but  I  fear  their  im- 
pact will  be  limited.  No  amount  of  resources  will  result  in  a 
complete  enumeration.  I  have  been  very  much  impressed  by
the  arguments  of  Professor  Nathan  Keyfitz,  who  I  see  is  on
the  program  this  morning,  that  the  decision  as  to  what 
procedure  should  be  followed  should  be  announced  before 
census  day.  I  believe  that  if  a  census  adjustment  convention 
is  announced  in  advance  (even  at  this  late  date),  it  will  em- 
phasize the  professional  and  technical  grounds  for  the 
decision  and  relieve  some  of  the  distrust  that  might  otherwise 
occur.  It  will  also  promote  the  kind  of  open  governmental 
procedure  our  Nation  needs. 

During  the  last  several  months,  the  members  of  the  sub- 
committee have  worked  to  build  confidence  in  the  census, 
while  we  also  work  to  make  you  aware  of  procedures  that 
need  to  be  improved.  Now,  here  on  the  eve  of  the  enumera- 
tion, I  want  to  assure  you  that  during  a  period  which  is 
bound  to  be  difficult,  I  will  work  closely  with  you  and  your 
staff  to  help  ensure  that  we  have  the  most  accurate  and 
complete  census  possible.  Thank  you  very  much  for  asking 
me  to  be  here. 


The  Census  Bureau 
Experience  and  Plans 


Jacob  S.  Siegel  and  Charles  D.  Jones 

Bureau  of  the  Census 


Since  the  previous  decennial  census  was  taken,  in  1970, 
the  legally  mandated  uses  of  census  data  have  proliferated  to 
the  point  where  the  distribution  of  many  billions  of  dollars 
depends  on  the  outcome  of  the  census.  Consequently,  the 
Bureau  of  the  Census  has  been  subject  to  a  great  deal  of 
public  and  political  pressure  to  produce  an  accurate  count  of 
the  population  and  to  adjust  the  counts  for  any  inaccuracies 
remaining  in  the  data.  In  designing  the  1980  census,  the 
Bureau  has  put  substantial  effort  into  improving  the  census-taking
procedures  over  those  used  in  the  1970  census.  However,
recognizing  that  no  matter  how  carefully  a  census  is
planned  and  executed  people  will  still  be  missed,  the  Census 
Bureau  has  been  planning  an  extensive  program  to  evaluate 
the  coverage  of  the  1980  census.  This  document  includes  a 
summary  of  previous  evaluation  programs  and  their  results, 
a  description  of  the  various  techniques  currently  planned  for 
use  in  measuring  the  coverage  of  the  1980  census,  the  plans 
for  combining  the  various  estimates,  as  well  as  a  discussion 
of  the  effects  of  census  errors  on  fund  allocations. 

PREVIOUS  EVALUATION  PROGRAMS  AND 
FINDINGS 

Beginning  with  the  1950  census,  the  Census  Bureau  has 
conducted  systematic  programs  to  evaluate  the  coverage  of 
decennial  censuses.  The  1950  census  evaluation  program 
included  the  first  large  scale  postenumeration  survey  (PES). 
This  survey  found  an  overall  omission  rate  of  1.4  percent. 
However,  subsequent  analysis  of  these  results  showed  that 
the  PES  seriously  understated  the  amount  of  undercoverage. 
In  fact,  demographic  analysis  carried  out  later  indicated  that 
the  probable  undercount  in  1950  was  3.3  percent. 

For  the  1960  census,  demographic  analysis  was  used  to 
produce  official  preferred  estimates  of  coverage.  These 
studies  indicated  that  5.1  million  persons,  or  2.7  percent,  of 
the  population  were  omitted  from  the  census  count.  Of 
these  omissions,  3.2  million  were  whites  (corresponding  to  an 
omission  rate  of  2.0  percent)  and  1.8  million  were  black  and
other  races  (or  8.1  percent).  The  1960  evaluation  program 
included  a  number  of  other  studies.  A  PES  was  again  con- 
ducted, but  unresolved  problems  relating  to  matching  and 
correlation  bias  prevented  the  Bureau  from  obtaining  useful 
results.  A  record  check  study  was  also  conducted  that 
matched  sampled  records  from  four  sources— the  1950 
census,  birth  records,  the  1950  PES,  and  alien  address 
records— with  the  1960  census.  A  range  of  estimates  was 
produced,  but  the  one  viewed  as  most  reasonable  showed
gross  omissions  of  4.0  percent,  corresponding  to  a  net
underenumeration  of  2.9  percent.  The  record  check  study 
showed  a  geographic  pattern  of  error  rates  consistent  with 
the  1950  PES.  Undercount  rates  were  highest  in  the  South, 
followed  by  the  West,  with  lower  rates  in  the  Northeast  and 
North  Central  States.  The  1960  evaluation  program  also 
included  an  evaluation  of  the  coverage  of  housing  units. 

The  1970  census  evaluation  program  included  a  wide 
range  of  studies.  Demographic  analysis  was  again  the  source 
for  preferred  national  estimates  of  undercount.  According 
to  this  analysis,  the  number  of  persons  missed  increased 
slightly  to  5.3  million,  but  the  percentage  undercount  dropped
to  2.5  percent.  There  was  again  a  marked  difference  in  the 
undercount  rate  for  whites  (1.9  percent)  and  for  blacks 
(7.7  percent),  as  well  as  for  males  (3.3  percent)  and  females 
(1.8  percent).  If  the  races  and  the  sexes  are  taken  together, 
even  greater  differences  were  apparent.  The  omission  rate 
for  black  males  was  9.9  percent,  but  only  2.4  percent  for 
white  males;  5.5  percent  of  black  females  were  missed,  but 
only  1.4  percent  of  white  females.  Omission  rates  were 
especially  high— 18  percent— among  black  males  aged  25 
to  44  years. 
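
The 1960 figures quoted above can be cross-checked with simple arithmetic. The sketch below assumes that each omission rate is expressed as a percentage of the estimated true population of the group (persons omitted divided by persons counted plus persons omitted). That definition and the function names are assumptions made for illustration, and the inputs are the rounded values given in the text.

```python
def implied_true_population(omitted_millions, rate_percent):
    """True population (millions) implied by an omission count and rate."""
    return omitted_millions / (rate_percent / 100.0)


def combined_rate(groups):
    """Overall omission rate (percent) for a list of (omitted, rate) groups."""
    total_omitted = sum(omitted for omitted, rate in groups)
    total_population = sum(
        implied_true_population(omitted, rate) for omitted, rate in groups
    )
    return 100.0 * total_omitted / total_population


# 1960 demographic analysis figures as quoted in the text (rounded):
# whites: 3.2 million omitted at 2.0 percent
# black and other races: 1.8 million omitted at 8.1 percent
groups_1960 = [(3.2, 2.0), (1.8, 8.1)]

overall_1960 = combined_rate(groups_1960)
print(round(overall_1960, 1))
```

With these rounded inputs the combined rate comes to about 2.7 percent, which agrees with the overall 1960 figure. The two group totals sum to 5.0 million rather than 5.1 million, a difference consistent with rounding in the published numbers.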

Because  of  the  difficulties  with  the  1950  and  1960  post- 
enumeration  surveys,  no  such  large-scale  survey  was  planned 
for  1970.  However,  a  number  of  other  studies  were  con- 
ducted, including  a  CPS-census  match,  a  Medicare-census 
match,  and  demographic  analysis  applied  to  States.  All  of 
these  provided  some  information  on  the  geographic  distri- 
bution of  the  undercount.  Again,  various  biases  appeared  to 
affect  the  results  of  the  CPS-census  match  study.  The  esti- 
mated undercount  rate  was  2.3  percent  overall,  or  1.8 
percent  for  whites  and  6.3  percent  for  blacks.  Coverage 
appeared  to  be  worst  in  the  South  followed  by  the  North- 
east, West,  and  North  Central  regions,  in  that  order.  The 
1970  CPS-census  match  study  showed  further  that  coverage 
appeared  to  be  better  in  urban  areas  than  rural  and  better 
in  metropolitan  than  nonmetropolitan  areas. 

Information  on  socioeconomic  differentials  in  coverage 
was  also  provided  by  the  1970  CPS-census  match  study. 
Consistent  with  the  results  of  the  1950  PES,  omission  rates 
for  lower  income  families  were  almost  twice  those  of  higher 
income  families;  this  pattern  was  particularly  true  for  whites, 
but  the  differences  were  not  very  great  for  blacks.  Higher 
omission  rates  overall  were  found  for  unemployed  workers
than  employed  workers,  but  the  reverse  was  true  for  blacks. 
For  both  races,  omission  rates  were  higher  for  persons  with 
lower  education. 

The  1970  CPS-census  match  study  also  produced  valuable 
information  relating  coverage  of  population  and  households 
that  was  used  in  planning  for  1980.  Only  about  half  of  all 
missed  persons  were  omitted  because  their  housing  unit  was 
missed.  However,  nearly  three-fourths  of  blacks  omitted  were 
in  enumerated  housing  units.  That  is,  only  a  small  proportion 
of  blacks  was  missed  because  their  housing  unit  was  missed. 
The  omissions  of  persons  in  covered  households  were  largely 
caused  by  a  combination  of  such  factors  as  public  apathy  or 
indifference,  carelessness  or  confusion  in  filling  out  the  forms,
and  deliberate  concealment,  but  we  cannot  quantify  the 
relative  contribution  of  these  factors. 

The  1970  evaluation  program  also  included  a  match  of 
Medicare  records  with  the  census  [7].  This  study  showed  a
gross  omission  rate  in  the  census  of  4.7  percent  for  persons 
65  years  of  age  or  over. 

Demographic  analysis  was  applied,  on  an  experimental 
basis,  to  the  problem  of  estimating  coverage  for  the  total 
population  of  States  in  1970  for  the  first  time  [10].  The 
1970  evaluation  program  also  included  studies  of  coverage  of 
the  Hispanic  population  [11]  and  of  the  implications  of 
undercount  for  political  representation  and  allocation  of 
funds,  in  general,  and  the  general  revenue  sharing  program, 
in  particular  [9].

METHODS  FOR  EVALUATING  CENSUS 
COVERAGE 

The  Census  Bureau's  current  plan  for  evaluating  the 
coverage  of  the  1980  census  at  various  geographic  levels 
relies  on  three  basic  approaches.  Demographic  analysis  will 
be  used  to  produce  official  estimates  of  census  coverage  at 
the  national  level.  The  method  of  demographic  analysis 
will  also  be  used  to  prepare  State  estimates,  but  the  quality 
of  the  resulting  estimates  is  uncertain.  A  large-scale  sample 
survey,  if  successful,  would  provide  the  basis  for  coverage 
estimates  for  States  and  major  SMSA's  and  cities.  Persons 
interviewed  in  the  survey  would  be  matched  on  a  case-by- 
case  basis  to  the  census  and  possibly  to  various  administra- 
tive record  files  to  produce  the  undercount  estimates.  On  the 
basis  of  these  projects,  the  Census  Bureau  is  planning  to  issue 
official  estimates  of  census  coverage  for  the  Nation  and 
States,  as  well  as  the  largest  SMSA's  and  cities. 

The  Census  Bureau  is  currently  investigating  methodology 
that  would  permit  estimation  of  census  coverage  for  smaller 
geographic  areas.  These  coverage  estimates  would  be  based 
on  statistical  techniques,  such  as  regression  analysis  or  syn- 
thetic estimation,  and  would  employ  data  from  the  sample 
survey  conducted  following  the  census,  the  census  itself, 
and  possibly  other  sources.  According  to  current  plans,  the 
coverage  estimates  for  local  areas  would  be  experimental  in
nature.  However,  should  the  Bureau  receive  a  clear  mandate
or  be  directed  to  produce  local  area  coverage  estimates,  the 
same  techniques  will  probably  be  employed. 
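
One common form of synthetic estimation, sketched here under stated assumptions, applies national omission rates for demographic groups to the census counts of a local area. The rates used are the 1970 race-sex rates quoted earlier in this paper. The two local areas, their counts, and the function name are hypothetical, and the procedure the Bureau eventually adopts may differ.

```python
# National omission rates (percent) by race and sex, 1970 demographic
# analysis, as quoted earlier in this paper.
NATIONAL_OMISSION_RATES = {
    ("white", "male"): 2.4,
    ("white", "female"): 1.4,
    ("black", "male"): 9.9,
    ("black", "female"): 5.5,
}


def synthetic_estimate(local_counts, national_rates):
    """Return (adjusted population, implied local omission rate in percent).

    Each group's census count is divided by (1 - national rate), on the
    assumption that the national rate for the group holds in every area.
    """
    counted = sum(local_counts.values())
    adjusted = sum(
        count / (1.0 - national_rates[group] / 100.0)
        for group, count in local_counts.items()
    )
    implied_rate = 100.0 * (adjusted - counted) / adjusted
    return adjusted, implied_rate


# Hypothetical census counts for two areas of 100,000 counted persons.
area_a = {
    ("white", "male"): 45000, ("white", "female"): 47000,
    ("black", "male"): 4000, ("black", "female"): 4000,
}
area_b = {
    ("white", "male"): 20000, ("white", "female"): 21000,
    ("black", "male"): 28000, ("black", "female"): 31000,
}

adjusted_a, rate_a = synthetic_estimate(area_a, NATIONAL_OMISSION_RATES)
adjusted_b, rate_b = synthetic_estimate(area_b, NATIONAL_OMISSION_RATES)
print(round(rate_a, 2), round(rate_b, 2))
```

The implied local rate is a weighted average of the national group rates, so two areas with the same counted population receive different adjustments only because their composition differs. The central weakness of the approach is its assumption that a group's omission rate does not vary from place to place.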

Several  key  components  of  these  plans  for  producing 
coverage  estimates  are  currently  being  investigated  and 
reviewed  with  regard  to  their  feasibility  and  the  validity  of 
the  results  produced.  Thus,  the  plans  presented  here  should 
be  viewed  as  our  most  optimistic  plans,  i.e.,  those  to  be 
implemented  under  the  most  favorable  circumstances.  Un- 
fortunately, some  of  these  procedures  may  fail  to  produce 
the  required  data.  For  example,  we  at  the  Census  Bureau, 
as  well  as  many  persons  working  in  this  field,  are  concerned 
about  the  feasibility  of  accurately  matching  two  sets  of 
records.  Since  record  matching  is  an  essential  part  of  a  num- 
ber of  the  studies  to  be  discussed,  this  problem  and  its 
resolution  will  have  a  major  impact  on  the  final  form  and 
scope  of  the  studies  to  be  conducted  following  the  1980 
census. 

The  overall  plans  and  objectives  for  evaluating  the  cover- 
age of  the  1980  census,  within  the  limitations  noted  above, 
can  be  summarized  as  follows.  (It  should  be  noted  again 
that  the  dates  cited  for  results  of  match  studies  are  dependent 
on  whether  satisfactory  matching  can  be  carried  out.) 

1.  Preliminary  national  estimates  of  coverage 

a.  Total  population  as  estimated  by  demographic 
analysis  by  January  1,  1981. 

b.  Age,  sex,  and  race  (white,  black  and  other  races) 
estimates  derived  by  demographic  analysis  by 
mid-1981  and  from  the  match  studies  by  mid-1982. 

c.  Estimates  for  the  Hispanic  population  from  the 
match  studies  by  mid-1982. 

2.  Revised  national  estimates  of  coverage  for  age,  sex,
and  race/origin  groups 

a.  "Demographic"  estimates  for  age,  sex,  and  race 
groups  by  mid-1982. 

b.  Combined  estimates  from  demographic  analysis  and 
the  match  studies  by  late  1983. 

c.  Estimates  from  the  match  studies  (primarily)  for 
Hispanic,  American  Indian,  and  Asian-  and  Pacific- 
American  populations  by  late  1983. 

3.  Estimates  of  coverage  for  States 

a.  Preliminary  estimates  from  the  match  studies  by 
mid-1982. 

b.  "Demographic"  estimates  by  late  1983. 

c.  Combined  estimates  from  demographic  analysis  and 
the  match  studies  by  late  1983. 

4.  Estimates  of  coverage  for  local  areas 

a.  Preliminary  estimates  for  major  cities  and  SMSA's 
from  the  match  studies  by  mid-1982. 

b.  Experimentation  on  estimates  for  local  areas  based
on  regression  and  synthetic  techniques. 

c.  Experimental  estimates  combining  demographic 
estimates  and  match  study  results  probably  with 
synthetic-regression  techniques  in  1984. 

National  Estimates

Demographic  Analysis

Demographic  analysis  as  a  tool  for  census  evaluation  in- 
volves developing  expected  values  for  the  population  in 
various  categories  (such  as  age,  sex,  race  categories)  at  the 
census  date  by  the  combination  and  manipulation  of  various 
types  of  demographic  data,  then  comparing  these  values  with 
the  corresponding  census  counts.  The  demographic  data  are 
drawn  from  sources  essentially  independent  of  the  census, 
such  as  birth,  death,  and  immigration  statistics,  historical 
series  of  census  data,  and  data  from  sample  surveys.  The 
accuracy  of  the  method  obviously  depends  on  the  quality 
and  logical  consistency  of  the  demographic  data. 
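
The comparison described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names and all figures are hypothetical, and the actual procedure is carried out separately for age, sex, and race categories with corrections applied to each data source.

```python
def expected_population(births, deaths, immigrants, emigrants):
    """Population expected at the census date from records independent of the census."""
    return births - deaths + immigrants - emigrants


def net_undercount(expected, census_count):
    """Return the net undercount and its rate as a share of the expected population."""
    missed = expected - census_count
    return missed, missed / expected


# Illustrative figures only (thousands of persons); these are not Bureau data.
expected = expected_population(births=4_000, deaths=150, immigrants=200, emigrants=50)
missed, rate = net_undercount(expected, census_count=3_900)
print(expected, missed, round(rate, 4))
```

A negative result would indicate a net overcount for the category in question.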

Demographic  analysis  will  provide  a  national  estimate  of 
net  underenumeration  for  the  total  population  and  national 
estimates  of  net  census  error,  combining  coverage  and  classi- 
fication errors  for  age,  sex,  and  race  (white,  black,  and  other 
races)  groups.  Following  the  1970  census,  this  method  was 
used  to  produce  the  official  estimates  of  coverage  for  the 
United  States  as  a  whole  (U.S.  Bureau  of  the  Census,  1974). 
(However,  no  corrected  figures  were  used  in  any  programs; 
general  revenue  sharing  and  the  Current  Population  Survey, 
for  example,  used  estimates  without  adjustment.)  The 
method  of  demographic  analysis  is  considered  by  the  census 
staff  to  be  more  effective  than  a  postcensal  sample  survey  for 
developing  satisfactory  estimates  of  net  undercounts  at  the 
national  level.  Consequently,  the  demographic  estimates  of 
census  coverage  are  envisioned  as  the  official  national  esti- 
mates for  1980. 

The  particular  procedure  used  to  estimate  coverage  for  the 
various  demographic  subgroups,  notably  age  groups,  depends 
on  the  nature  of  available  data  and  on  timing  requirements  of 
the  overall  evaluation  program.  For  groups  under  age  45  in 
1980,  i.e.,  persons  born  after  1935,  estimates  of  the  corrected 
population  will  be  developed  from  birth,  death,  and  immi- 
gration statistics.  For  the  population  over  age  65,  aggregate 
Medicare  data  will  provide  the  basis  for  coverage  estimates. 
For  the  remaining  ages,  45  to  64  years,  the  coverage  esti- 
mates will  be  extensions  of  the  estimates  for  ages  35  to  54 
in  1970;  these  were  derived  from  analysis  of  previous  cen- 
suses. Actual  death  statistics  will  be  used  to  allow  for  mor- 
tality up  to  age  74.  Official  immigration  statistics,  supple- 
mented by  estimates  of  other  legal  immigration,  illegal 
immigration,  and  emigration,  will  be  used  to  allow  for  net 
immigration. 

Different  methods  may  be  used  for  the  same  age  groups
in  the  preliminary  and  revised  estimates  because  of  the
availability  of  different  data.  In  some  instances,  it  is  not
possible  to  specify  the  choice  among  alternatives  for  the 
revised  estimates  at  this  time.  It  should  be  noted  that  demo- 
graphic analysis  has  not  proven  successful  in  developing 
coverage  estimates  for  the  Hispanic  population.  Estimates 
of  coverage  for  this  group  in  1980  are  expected  to  be  ob- 
tained from  the  match  studies. 

Birth  and  death  statistics.  Registered  births  over  several 
decades  provide  a  direct  basis  for  estimating  the  corrected 
numbers  of  persons  in  most  age  groups  in  1980,  covering 
a  large  majority  of  the  total  population.  Statistics  on  regis- 
tered births  are  available  (by  race  and  sex)  for  all  States 
since  1933.  In  addition,  tests  of  birth  registration  complete- 
ness were  conducted  for  1940,  1950,  and  1964-68  that 
provide  correction  factors  for  these  years;  factors  for  other 
years  can  be  obtained  by  interpolation  and  extrapolation. 
These  data  will  be  used  in  estimating  census  coverage  for  the 
population  under  age  45  in  1980. 
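
As a rough sketch of how correction factors for intermediate years might be obtained, the following Python fragment interpolates linearly between benchmark years and holds the nearest benchmark constant outside them. The completeness values are hypothetical placeholders rather than the actual test results, the use of 1966 to stand for the 1964-68 test is an assumption, and the flat extrapolation rule is only one of several possible choices.

```python
# Hypothetical completeness benchmarks (share of births registered); not the actual test results.
# 1966 is assumed here as the midpoint of the 1964-68 test period.
BENCHMARKS = {1940: 0.92, 1950: 0.98, 1966: 0.99}


def completeness(year, benchmarks=BENCHMARKS):
    """Registration completeness for a year, by linear interpolation between benchmarks."""
    years = sorted(benchmarks)
    if year <= years[0]:   # flat extrapolation before the first benchmark (an assumption)
        return benchmarks[years[0]]
    if year >= years[-1]:  # flat extrapolation after the last benchmark (an assumption)
        return benchmarks[years[-1]]
    for lo, hi in zip(years, years[1:]):
        if lo <= year <= hi:
            frac = (year - lo) / (hi - lo)
            return benchmarks[lo] + frac * (benchmarks[hi] - benchmarks[lo])


def corrected_births(registered, year):
    """Registered births inflated to allow for underregistration."""
    return registered / completeness(year)


print(round(completeness(1945), 3), round(corrected_births(950, 1945)))
```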

One  way  of  deriving  U.S.  totals  of  births  corrected  for 
underregistration  is  to  aggregate  the  results  for  States  (by 
race).  The  State  estimates  of  births  corrected  for  under- 
registration, for  years  since  1935  and  for  race  groups,  will 
serve  also  as  basic  data  for  deriving  demographic  estimates 
of  coverage  for  States.  Several  projects  have  been  undertaken 
to  extend  and  improve  this  data  base.  If  successful,  these
projects  would  permit  the  estimation  of  coverage  for  States
from  birth  statistics  for  ages  45  to  64  in  1980  also.  This 
technique  is  believed  to  be  greatly  superior  to  that  used  to 
produce  coverage  estimates  for  ages  35  to  54  for  States  in 
1970  (U.S.  Bureau  of  the  Census,  1977). 

Current  plans  call  for  using  registered  deaths  in  the  cal- 
culations with  no  correction  for  underregistration  or  mis- 
reporting  of  the  characteristics  of  decedents.  Alternative 
calculations  will  permit  investigation  of  effects  on  coverage 
estimates  of  allowances  for  possible  underregistration  of 
deaths,  particularly  for  infant  deaths  in  earlier  years.  Simi- 
larly, effects  of  age  misreporting  on  death  certificates  can  be 
investigated. 

Immigration  statistics.  Data  on  legal  immigrants  admitted 
to  the  United  States  for  1935  to  1980  will  be  used  in  esti- 
mating the  corrected  population  under  age  45  in  1980.  The 
Immigration  and  Naturalization  Service  (INS)  publishes  data 
on  immigrants  classified  by  age,  sex,  and  country  of  origin. 
To  be  included  in  the  immigration  component  of  the  esti- 
mates of  expected  population  are  other  data  items,  some 
supplied  by  INS.  These  include  net  arrivals  in  the  United 
States  from  Puerto  Rico,  net  arrivals  of  civilian  citizens,  and 
parolees.  The  quality  of  these  data  is  under  investigation.

One  component  of  the  expected  population  in  1980  for 
which  data  are  lacking  is  illegal  immigration.  Obviously, 
because  of  the  nature  of  this  population,  an  accurate  estimate 
of  its  size  will  be  quite  difficult  to  make.  The  range  of 
existing  estimates  for  the  illegal  population  in  recent  years 
is  quite  broad.  No  satisfactory  estimates  of  either  the  net
flow  or  the  illegal  resident  population  in  the  United  States 
are  available.  A  variety  of  estimation  techniques  have  been 
used  to  try  to  establish  the  number,  but  the  true  number 
remains  unknown. 

The  Census  Bureau  has  undertaken  an  evaluation  of  the 
existing  studies  of  illegal  immigration  to  the  United  States 
and  is  investigating  various  approaches  to  the  estimation 
problem  in  addition  to  those  previously  employed.  On  the 
basis  of  the  existing  work  and  its  own  research,  the  Bureau 
expects  to  develop  a  range  of  working  estimates  of  the  illegal 
alien  population  to  be  included  in  the  estimate  of  the  ex- 
pected population  in  1980. 

Movement  out  of  the  United  States  (by  both  citizens 
and  aliens)  is  another  component  for  which  no  satisfactory 
data  exist.  The  methods  used  in  the  past  to  develop  emigra- 
tion estimates  present  problems  in  terms  of  timeliness, 
coverage,  consistency  and  accuracy  of  the  estimates  over 
time,  and  the  scope  of  the  assumptions  required  [4].  Consequently,
the  Census  Bureau  has  been  considering  a  test  of
network  (multiplicity)  sampling  in  conjunction  with  the 
Current  Population  Survey  in  late  1980  to  investigate  the 
feasibility  of  obtaining  information  on  emigration  from  the 
United  States. 

With  the  multiplicity  technique,  respondents  are  asked 
whether  certain  specified  relatives  have  emigrated  from  the 
United  States.  Persons  with  such  relatives  are  asked  further 
questions  to  obtain  information  on  the  emigrants  and  for 
weighting  the  sample  responses.  The  results  of  the  multi- 
plicity survey  will  not  be  available  for  the  preliminary  esti- 
mates of  undercount  so  that  indirect  estimation  techniques, 
such  as  those  based  on  INS  alien  registration  data  and 
Social  Security  data,  will  have  to  be  employed.  Should  the 
test  of  the  multiplicity  approach  prove  successful,  it  may 
be  possible  to  develop  emigration  estimates  on  the  basis  of 
a  full-scale  survey  conducted  some  time  in  1981.
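
The weighting step can be illustrated as follows. In the usual form of a multiplicity (network) estimator, which is assumed here because the formula is not given in this paper, each reported emigrant is weighted by the reciprocal of the number of households eligible to report that person, so that an emigrant with several reporting relatives is not counted more than once on average. The uniform household weight and the sample responses below are hypothetical.

```python
def multiplicity_estimate(reports, household_weight):
    """Estimate total emigrants from sample reports.

    reports: one list per sample household; each entry is the multiplicity of a
             reported emigrant, i.e. the number of households eligible to report
             that emigrant (obtained from the follow-up questions).
    household_weight: sampling weight, assumed equal for all households.
    """
    total = 0.0
    for household in reports:
        for multiplicity in household:
            total += household_weight / multiplicity
    return total


# Hypothetical responses: four sample households, each representing 1,000 households.
reports = [[2], [2, 1], [], [4]]
estimate = multiplicity_estimate(reports, household_weight=1_000)
print(estimate)
```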

Medicare  statistics.  The  corrected  population  aged  65 
years  and  over  for  age,  race,  and  sex  categories  will  be  de- 
veloped from  aggregated  Medicare  statistics.  Those  statistics 
will  be  adjusted  for  estimated  underenrollment  and  then 
compared  with  census  counts  to  obtain  undercount  esti- 
mates. The  factors  for  adjusting  the  Medicare  data  are  to  be 
obtained  as  a  by-product  of  the  proposed  match  studies. 

In  the  meantime,  two  alternatives  are  available  for  pro- 
viding preliminary  coverage  estimates  for  the  population  65 
and  over  in  age-sex-race  categories.  One  alternative  would 
involve  carrying  forward  the  corrected  population  aged  55 
and  over  in  1970  with  estimates  of  deaths  and  net  international
migration  [8].  Another  would  utilize  preliminary
aggregated  Medicare  data  for  1980.  These  data  would  be 
corrected  for  underenrollment  on  the  basis  of  correction 
factors  developed  from  a  test  of  record-matching  tech- 
niques involving  CPS,  IRS,  and  Medicare  data  for  February 
1978. 


Preliminary  and  Revised  Estimates 

The  Census  Bureau  is  planning  to  release  at  least  two 
national  coverage  estimates  based  on  demographic  analysis: 
(1)  Preliminary  estimates  to  meet  the  demand  for  estimates 
at  the  earliest  possible  date;  and  (2)  revised  estimates  to 
incorporate  data  and  research  findings  that  become  available 
later.  Preliminary  estimates  of  coverage  for  the  total  population
only  will  be  released  about  January  1,  1981.  Preliminary
coverage  estimates  for  age,  sex,  and  race  categories  should  be 
available  by  mid-1981.  Because  of  the  need  to  analyze  the 
census  data  fully  (particularly  the  racial  categories)  and  to 
complete  ongoing  research,  the  revised  "demographic"  esti- 
mates of  coverage  will  not  be  available  until  mid-1982. 

The  preliminary  and  revised  estimates  will  use  different 
data  and  methods  to  estimate  various  components  of  the 
corrected  population  in  1980.  For  ages  45  to  64,  the  pre- 
liminary estimates  will  be  extensions  of  the  estimates  for 
ages  35  to  54  in  1970,  based  on  analysis  of  previous  census 
data  [8].  The  revised  estimates  for  these  ages  may  be  based
on  survivors  of  births  as  corrected  at  the  State  level  on  the 
basis  of  research  now  being  conducted.  Other  revisions 
involve  replacing  provisional  data  on  births,  deaths,  and 
immigration  for  1979  and  1980  with  final  data.  Because  of 
the  uncertainty  involved  in  estimating  illegal  immigration 
and  the  availability  of  new  data,  it  is  very  likely  that  the 
estimate  of  the  illegal  alien  population  will  be  modified  in 
the  revised  coverage  estimates.  Furthermore,  in  both  in- 
stances, a  range  of  estimates  may  be  employed. 

Problems  in  obtaining  the  required  data  prevent  com- 
pletion before  1983  of  revised  estimates  for  the  population 
over  65.  The  Medicare  files  for  April  1,  1980,  will  not  be 
complete  until  the  end  of  1980.  More  importantly,  however, 
since  a  match  involving  the  census  or  the  coverage  evaluation 
survey  and  administrative  records  is  required  to  develop  the 
revised  factors  for  adjusting  the  Medicare  data,  the  estimates 
cannot  be  completed  until  the  match  results  are  known, 
analyzed,  and  incorporated  into  the  estimation  procedure, 
that  is,  in  1983. 

Quality  of  the  "Demographic"  Estimates 

The  present  plans  of  the  Census  Bureau  rely  primarily  on 
the  use  of  demographic  analysis  for  deriving  preferred  esti- 
mates of  underenumeration  at  the  national  level.  When  the 
1970  estimated  undercount  was  announced  as  5.3  million, 
a  range  of  error  extending  from  4.8  to  5.8  million  was  also 
offered.  These  figures  did  not  represent  a  statistical  confi- 
dence interval  but  rather  the  possible  effect  of  errors  in  the 
components.  In  his  doctoral  dissertation,  Fay  [1]  estimated 
a  standard  deviation  in  the  undercount  estimate  of  0.5  to 
0.9  million  (but  it  should  be  noted  that  his  preferred  estimate
of  the  undercount  was  6.1  million).

The  development  of  "demographic"  estimates  of  under- 
coverage  requires  the  combination  of  data  from  a  number  of 
sources.  Demographic  analysis  includes  correcting  these  data
for  known  errors.  Even  with  these  corrections,  great  un- 
certainty remains  about  the  accuracy  of  specific  components, 
in  particular  emigration  and  net  illegal  immigration.  The 
Census  Bureau  is  currently  investigating  methods  for  esti- 
mating variances  for  the  demographic  estimates.  One 
approach  would  involve  combining  subjective  confidence 
intervals  for  the  components  with  conventional  statistical 
techniques.  By  its  very  nature,  a  census  undercount  can  be 
elusive  to  estimate  exactly,  but  demographic  analysis  seems 
to  give  reasonable  and  reliable  results. 
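
One way such a combination might work is sketched below. It is not the Bureau's method, which is still under investigation: the sketch assumes that each subjective range can be read as roughly plus or minus two standard deviations, that the component errors are independent, and that the component names and ranges (in millions) are purely hypothetical.

```python
import math


def sd_from_range(low, high, z=2.0):
    """Treat a subjective range as +/- z standard deviations about its midpoint (an assumption)."""
    return (high - low) / (2 * z)


def combined_sd(ranges):
    """Standard deviation of a sum of components, assuming independent errors."""
    return math.sqrt(sum(sd_from_range(low, high) ** 2 for low, high in ranges))


# Hypothetical subjective error ranges for three components, in millions of persons.
component_ranges = {
    "births": (-0.2, 0.2),
    "emigration": (-0.4, 0.4),
    "illegal immigration": (-0.8, 0.8),
}
total_sd = combined_sd(component_ranges.values())
print(round(total_sd, 3))
```

If the component errors were positively correlated, the independence assumption would understate the combined uncertainty.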

Estimates  for  States 

Demographic  Analysis 

A  "demographic"  approach  to  estimating  the  coverage  of 
State  populations  was  attempted  for  1970  [10].  The  basic
approach  involved  first  estimating  the  coverage  of  the  popu- 
lation born  in  each  State  (for  ages  under  35  in  1970)  by 
comparing  survivors  of  births  with  the  census  data  on  the 
population  born  in  each  State.  Then,  several  different  pro- 
cedures and  assumptions  were  used  to  convert  the  coverage 
estimates  for  the  population  born  in  each  State  into  esti- 
mates for  the  population  living  in  each  State.  The  procedures 
and  assumptions  involved  a  number  of  parameters  for  which 
no  data  were  available.  Accordingly,  a  number  of  alternative 
sets  of  State  coverage  estimates  for  the  population  under  35 
were  derived  rather  than  a  single  preferred  set.  Other  un- 
certainties in  the  estimation  procedure,  particularly  the  lack 
of  any  reliable  methodology  for  ages  35  to  64,  led  to  other 
alternative  estimates  and  to  the  characterization  of  the 
estimates  as  "developmental."

For  1980,  we  plan  again  to  produce  "demographic"  esti- 
mates of  census  coverage  for  States.  Several  developments 
should  remove  some  of  the  uncertainties  in  the  estimation 
methods  to  the  point  where  it  may  be  possible  to  derive  a 
useful  set  of  estimates.  The  match  studies,  if  successful, 
would  provide  estimates  of  the  relative  coverage  of  lifetime 
interstate  migrants  and  nonmigrants.  This  parameter  is 
crucial  to  the  estimation  procedure;  values  had  to  be  assumed 
for  1970.  The  information  from  the  match  study  should 
permit  a  simplification  of  the  method,  including  elimination 
of  the  separate  calculation  of  coverage  estimates  for  States 
of  birth  and  the  resulting  necessity  of  converting  those 
estimates  to  represent  States  of  residence. 

The  data  on  births,  which  extend  back  to  1935,  will  cover 
a  larger  proportion  of  the  population  in  1980.  In  addition,  as 
previously  mentioned,  research  is  under  way  to  extend  the 
data  on  corrected  births  back  to  1925  or  1915.  If  successful, 
these  projects  would  virtually  eliminate  the  need  for  ratio 
estimates  in  the  middle  age  range.  We  also  hope  to  remove 
some  of  the  uncertainty  in  the  coverage  estimates  for  ages 
65  and  over  with  the  results  of  studies  being  conducted  at 
the  Bureau  of  the  Census  on  the  accuracy  of  residence  re- 
porting in  the  Medicare  files. 


The  "demographic"  estimates  of  coverage  for  States  are 
based  on  the  place-of-birth  data  collected  on  the  sample 
form.  Since  these  data  will  not  be  available  before  1982  and 
considerable  analytic  work  is  necessary,  the  coverage  esti- 
mates will  not  be  completed  before  late  1983.  The  quality 
of  the  data  on  State  of  birth  has  apparently  been  deterior- 
ating in  the  last  three  censuses,  e.g.,  higher  nonresponse 
rates  and  evidence  of  greater  misreporting.  Should  this  trend 
continue,  there  will  be  serious  problems  with  any  method- 
ology based  on  State  of  birth. 

Reinterviews  and  Record  Checks 

The  Bureau  of  the  Census  has  undertaken  a  considerable 
amount  of  research  on  the  feasibility  of  conducting  a  sample 
survey  as  soon  as  possible  after  the  census  enumeration  has 
been  completed  to  meet  the  demand  for  estimates  of  census 
coverage  for  States  and  various  local  areas.  Coverage  error 
would  be  estimated  by  matching  persons  listed  in  the  survey 
on  a  one-to-one  basis  with  the  census  listing  of  names.  The 
survey  would  be  designed  to  provide  reliable  estimates  of 
net  coverage  error  at  the  State  level  for  the  total  population, 
and  at  broader  geographic  levels,  such  as  regions  or  divisions, 
for  the  principal  races  and  the  Hispanic  population.  If  the 
matching  can  be  successfully  performed,  the  survey  would 
also  provide  estimates  of  net  coverage  error  for  the  total 
population  of  the  26  largest  SMSA's,  their  central  cities, 
and  six  SMSA's  and  their  central  cities  which  have  propor- 
tionally large  minority  populations.  The  survey  would  also 
provide  estimates  for  various  socioeconomic  categories  at 
the  national  and  regional  level.  (Demographic  analysis,  as 
such,  cannot  be  used  to  provide  estimates  of  coverage  error 
for  socioeconomic  categories  or  for  substate  areas.) 

Postenumeration  surveys  were  conducted  as  part  of  the 
1950  and  1960  census  evaluation  programs.  These  studies 
were  not  successful  in  providing  accurate  estimates  of  the 
undercount  for  certain  subgroups  of  the  population,  how- 
ever. Other  evidence,  including  estimates  derived  by  demo- 
graphic analysis  and  the  implausibility  of  the  sex  ratios 
shown  by  the  PES,  clearly  indicated  that  the  PES  estimates 
of  underenumeration  were  seriously  biased  downward.  This 
bias  was  especially  evident  for  black  males  aged  15  to  59, 
for  whom  the  PES  yielded  an  estimate  of  the  undercount 
that  was  approximately  one-half  the  estimate  provided  by 
demographic  analysis.  Erroneous  results  such  as  these  are 
apparently  caused  by  the  problem  called  "correlation  bias." 
This  bias  stems  from  the  fact  that  persons  enumerated  in  the 
census  tend  to  be  enumerated  in  the  survey  at  a  greater  rela- 
tive rate  than  persons  missed  in  the  census;  that  is,  persons 
missed  in  the  census  tend  not  to  be  reported  in  the  survey  for 
the  same  reasons  that  they  were  missed  in  the  census. 
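
The direction of this bias can be illustrated with a simple dual-system (capture-recapture) estimator, which is used here as an assumption for illustration; the 1950 and 1960 PES estimators were not necessarily of this form. The two strata and their inclusion probabilities are hypothetical. Within each stratum, inclusion in the census and in the survey is taken to be independent, but because the hard-to-count stratum is less likely to appear in either source, the pooled estimate falls short of the true total.

```python
def dual_system_estimate(census_total, survey_total, matched):
    """Estimated population from two lists and the number of persons found in both."""
    return census_total * survey_total / matched


# Hypothetical strata: (true size, P(counted in census), P(listed in survey)).
strata = [(900, 0.98, 0.95), (100, 0.60, 0.50)]

true_total = sum(n for n, _, _ in strata)
census = sum(n * pc for n, pc, _ in strata)          # expected census count
survey = sum(n * ps for n, _, ps in strata)          # expected survey count
matched = sum(n * pc * ps for n, pc, ps in strata)   # expected count found in both

# Pooling the strata ignores the correlation between census and survey inclusion.
pooled = dual_system_estimate(census, survey, matched)

# Estimating within each stratum, where inclusion is independent, recovers the total.
by_stratum = sum(
    dual_system_estimate(n * pc, n * ps, n * pc * ps) for n, pc, ps in strata
)

print(true_total, round(census), round(pooled, 1), round(by_stratum, 1))
```

In this example the pooled estimate recovers only part of the gap between the census count and the true population, which is the pattern described above for the PES.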

Sample  survey  methodology.  A  considerable  amount  of 
research  has  been  conducted  as  part  of  the  census  pretests  in 
Oakland,  Calif.,  and  Richmond,  Va.,  to  develop  a  methodology
for  a  special  survey  to  estimate  census  underenumeration.
This  research  has  been  designed  to  attempt  to  overcome
some  of  the  problems  already  discussed.  The  two  post- 
enumeration  surveys  from  the  census  pretests  are  still  under- 
going analysis.  Two  aspects  of  this  research  may  have  a  large 
impact  on  our  planning:  The  difficulty  of  matching  and  the 
greater  than  anticipated  clustering  of  errors,  which  will  have 
a  direct  impact  on  the  precision  of  any  estimates  obtained. 

The  problems  of  matching  have  led  to  the  development 
of  two  questionnaires  combined  into  one  form.  The  first 
questionnaire,  called  Procedure  A,  lists  all  persons  at  the 
sample  housing  unit  at  the  time  of  the  census.  Procedure  B 
lists  all  persons  who  live  at  the  sample  address  and  deter- 
mines where  each  person  was  living  at  the  time  of  the  census. 

Both  procedures  can  be  used  to  obtain  estimates  of  the 
total  number  of  persons.  However,  the  results  obtained  from 
either  procedure  are  highly  dependent  on  how  well  matching 
can  be  done  from  the  survey  to  the  census.  Procedure  A  has 
the  advantage  of  providing  very  good  information  on  ad- 
dresses but  suffers  from  relatively  poor  information  on 
characteristics  of  persons  who  have  moved  in  the  several 
months  between  census  day  and  the  survey.  Procedure  B, 
on  the  other  hand,  provides  good  data  on  characteristics  of 
all  respondents,  but  relatively  poor  information  on  census 
day  addresses  for  persons  who  have  moved  since  the  census. 
Both  procedures  rely  heavily  on  being  able  to  take  what  in- 
formation is  available— both  the  address  where  each  person 
lived  on  census  day  and  demographic  data  on  each  person— 
and  match  back  to  the  census.  The  two  pretests  and  the  two 
earlier  postenumeration  surveys  indicate  that,  despite  our
best  efforts,  our  ability  to  match  successfully  and  correctly 
is  suspect. 

The  problems  with  matching  are  further  exacerbated  by 
the  fact  that  difficulty  in  matching  seems  to  be  related  to 
the  same  characteristics  that  underlie  the  undercount. 
Matching  seems  especially  difficult  in  rural  areas  and  in  areas 
with  high  concentrations  of  small  multiunit  buildings  (10 
units  or  less  in  a  structure).  Accordingly,  apparently  high 
undercoverage  rates  in  largely  rural  States  or  poor  urban 
areas  may  be  due  to  differential  rates  of  success  in  matching 
and  not  so  much  to  true  differences  in  the  undercoverage 
of  the  population.  Alternative  procedures  for  matching  and 
controlling  these  differences  are  still  being  tested  and  com- 
pared. 

The  second  major  discovery  from  the  pretests  was  the 
high  degree  of  clustering  of  census  misses  and  erroneous 
enumerations.  Within  each  State  or  SMSA,  a  sample  of  blocks 
is  selected  (with  a  possible  intermediate  stage  of  county 
selection).  The  blocks  are  completely  listed,  and  housing 
units  within  the  blocks  are  selected  as  the  final  sampling 
stage,  with  all  persons  in  a  housing  unit  listed  for  the  PES. 
The  correlation  within  households  for  census  errors  and  the 
correlation  between  housing  units  have  been  found  in  the 
pretests  to  be  higher  than  anticipated.  These  correlations 
lead  to  larger  than  anticipated  design  effects  in  the  variance
estimates  for  underenumeration  rates.  The  problems  in
matching  mentioned  earlier  combined  with  this  recent 
discovery  of  the  large  design  effects  in  the  pretest  make  it 
difficult  to  predict  exactly  how  accurate  the  PES  estimates 
will  be  for  States  or  substate  areas.  The  sample  allocation 
outlined  above  should  yield  the  most  reliable  estimates  of 
the  total  corrected  population  possible  for  a  sample  of 
250,000  housing  units,  but  a  question  remains  as  to  how 
reliable  these  estimates  would  be. 
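
The effect of clustering on precision can be gauged with the common approximation deff = 1 + (m - 1) * rho, where m is the average cluster size and rho the intraclass correlation. This formula is a standard textbook approximation assumed here, not a result of the pretest analysis, and the cluster size and correlation below are hypothetical; only the sample size of 250,000 housing units comes from the plans described above.

```python
def design_effect(cluster_size, intraclass_corr):
    """Approximate variance inflation relative to a simple random sample of the same size."""
    return 1 + (cluster_size - 1) * intraclass_corr


def effective_sample_size(n, cluster_size, intraclass_corr):
    """Size of the simple random sample that would give the same precision."""
    return n / design_effect(cluster_size, intraclass_corr)


# Hypothetical values: 25 sample housing units per block, intraclass correlation 0.05.
deff = design_effect(cluster_size=25, intraclass_corr=0.05)
n_eff = effective_sample_size(250_000, cluster_size=25, intraclass_corr=0.05)
print(round(deff, 2), round(n_eff))
```

Even a modest correlation within blocks can therefore cut the effective sample size by more than half.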

Administrative  record  match  (ARM).  In  addition  to 
conducting  a  sample  survey  to  estimate  census  coverage,  the 
Census  Bureau  is  considering  the  use  of  additional  data  from 
"independent"  administrative  files  to  improve  the  estimates 
of  coverage  error.  To  the  extent  that  satisfactory  matching 
of  the  administrative  files,  the  census,  and  a  survey  can  be 
achieved  without  impairing  independence  of  the  sample  data, 
more  accurate  estimates  of  coverage  error  should  be  pro- 
duced than  in  1950  or  1960.  Two  administrative  files  are 
being  considered  for  this  purpose:  the  Internal  Revenue 
Service  (IRS)  tax  return  file  for  persons  aged  17  to  64  years 
of  age  and  the  Medicare  file  for  persons  65  or  older. 

The  feasibility  of  using  these  files  is  currently  being 
tested.  The  February  1978  Current  Population  Survey  is
being  used  as  a  proxy  for  a  postenumeration  survey  in  this 
test.  Data  were  collected  to  facilitate  a  match  with  the  ad- 
ministrative files.  Dual  systems  estimates  from  this  match 
for  the  total  "corrected"  population  for  February  1978  will 
be  compared  with  "demographic"  estimates  of  the  total 
corrected  population.  If  the  problems  of  matching  to  the 
administrative  files  prove  to  be  surmountable  and  the  dual 
systems  estimates  of  total  population  are  reasonable,  cover- 
age rates  based  on  the  administrative  records  could  be  used, 
along  with  "demographic"  estimates,  to  adjust  the  survey 
estimates  of  coverage  error  in  the  1980  census. 

Combination  of  Estimates  of  Census  Coverage 

Demographic  and  survey-ARM  estimates:  Nation  and 
States.  Once  estimates  of  coverage  are  available  from  the 
survey  and  the  administrative  record  match,  these  estimates 
can  be  combined  with  the  "demographic"  estimates.  Esti- 
mates from  match  studies  can  be  derived  for  detailed  demo- 
graphic, socioeconomic,  or  geographic  categories.  The 
"demographic"  estimates,  however,  will  be  suitable  only  for 
national  or  State  estimates  of  various  demographic  categories. 

The  assumptions  underlying  the  combination  of  data  from 
the  different  sources  are  that  the  "demographic"  estimates 
are  more  accurate  than  the  estimates  based  on  matching 
methods  at  the  national  level,  but  the  matching  studies  are 
better  for  measuring  differences  between  geographic  areas 
and  are  the  only  basis  for  measuring  coverage  differences 
between  socioeconomic  subgroups  in  the  population.  Esti- 
mates of  coverage  for  State  populations  are  expected  to  be 
available   from   demographic  analysis  and  from  the  match 
studies.  At  this  time,  the  relative  quality  of  the  two  types
of  estimates  cannot  be  known.  Should  either  prove  to  be 
unacceptable,  the  other  will  be  used  alone.  If,  however,  both 
sets  of  estimates  are  acceptable,  it  would  seem  desirable  to 
combine  them,  taking  the  variances  of  the  estimates  into 
account  in  the  weighting  procedures.  Variance  estimates 
would  be  available  for  the  survey  estimates  as  part  of  the 
evaluation  of  the  results.  Research  is  now  in  progress  for 
developing  estimates  of  the  variance  of  demographic  esti- 
mates of  coverage  for  States. 
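
A minimal sketch of one such weighting procedure is inverse-variance weighting, shown below. It assumes that the two estimates are unbiased and independent, which may not hold in practice, and the undercount rates and variances are hypothetical. The weighting rule itself is an assumption; the paper does not specify the procedure to be used.

```python
def combine(est_a, var_a, est_b, var_b):
    """Inverse-variance weighted average of two independent estimates and its variance."""
    precision_a = 1 / var_a
    precision_b = 1 / var_b
    weight_a = precision_a / (precision_a + precision_b)
    combined = weight_a * est_a + (1 - weight_a) * est_b
    combined_var = 1 / (precision_a + precision_b)
    return combined, combined_var


# Hypothetical State undercount rates (percent) and their variances.
rate, variance = combine(est_a=2.0, var_a=0.25,   # "demographic" estimate
                         est_b=3.0, var_b=0.75)   # match study estimate
print(round(rate, 3), round(variance, 4))
```

The more precise estimate receives the larger weight, and the combined variance is smaller than either component variance.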

A  possible  procedure  for  combining  the  "demographic" 
estimates  and  the  match  study  estimates,  which  takes  ad- 
vantage of  the  better  features  of  both  types  of  estimates, 
would  involve  using  the  demographic  estimates,  particularly 
the  national  estimates,  as  "controls"  or  marginal  totals  for 
the  estimates  from  the  survey.  The  final  product  of  the 
estimation  procedure  would  be  sets  of  tables  produced  from 
the  results  of  the  survey  (or  a  merger  of  the  survey  and 
demographic  analysis)  "raked"  to  marginal  totals  that 
correspond  to  the  analytic  estimates  for  age,  race,  and  sex 
groups  nationally.  The  resulting  estimates  would  be  the 
adjusted  counts  for  States,  large  SMSA's,  and  large  cities. 
They  would  not  be  available  before  mid-1983. 
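
Raking of this kind can be sketched as iterative proportional fitting. The example below adjusts a small two-way survey table alternately to row and column control totals until both margins agree with the controls. The table, the choice of age by race as the two dimensions, and the control totals are all hypothetical, and the sketch assumes every cell is positive.

```python
def rake(table, row_targets, col_targets, iterations=50):
    """Scale a two-way table alternately to row and column control totals."""
    cells = [row[:] for row in table]  # work on a copy
    for _ in range(iterations):
        # Adjust each row to its control total.
        for i, target in enumerate(row_targets):
            total = sum(cells[i])
            cells[i] = [value * target / total for value in cells[i]]
        # Adjust each column to its control total.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in cells)
            for row in cells:
                row[j] *= target / total
    return cells


# Hypothetical survey estimates (rows: two age groups; columns: two race groups).
survey_table = [[40.0, 60.0], [30.0, 70.0]]
age_totals = [110.0, 100.0]   # hypothetical national controls by age
race_totals = [80.0, 130.0]   # hypothetical national controls by race
raked = rake(survey_table, age_totals, race_totals)
print([[round(value, 2) for value in row] for row in raked])
```

The two sets of control totals must sum to the same overall figure for the procedure to converge on both margins.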

Estimates  for  Sub-State  Areas 

If  census  data  are  to  be  adjusted  for  allocation  of  funds, 
there  is  a  need  for  estimates  of  census  coverage  for  all  cities, 
counties,  and  other  local  units  of  government.  Since  develop- 
ing reliable  coverage  estimates  for  States  and  large  SMSA's 
requires  a  very  large  sample,  the  Census  Bureau  obviously 
cannot  afford  a  survey  to  develop  coverage  estimates  for 
smaller  areas.  Accordingly,  the  Bureau  is  conducting  research 
into  other  techniques  for  producing  coverage  estimates  for 
sub-State  areas.  At  this  point,  it  appears  that  any  estimates
produced  will  be  experimental  in  nature.  Techniques  for 
validating  sub-State  estimates  of  census  coverage  have  not 
been  developed. 

Regression  and  Synthetic  Estimation 

Two  alternative  procedures,  regression  and  synthetic 
estimation,  are  being  considered  for  obtaining  estimates 
of  census  coverage  for  smaller  areas.  The  postcensal  survey 
is  being  designed  to  produce  data  that  could  be  utilized  by 
these  procedures.  Broadly  speaking,  the  sample  is  being  de- 
signed to  provide  reliable  estimates  of  the  corrected  popula- 
tion of  specified  minorities  and  specified  socioeconomic 
categories  for  areas  broader  than  States  (e.g.,  regions,  or 
urban-rural  populations).  One  application  of  regression 
analysis  to  estimating  net  undercount  for  counties  might 
start  with  the  counties  that  are  in  the  sample.  Regression 
models  for  these  areas  would  be  developed  in  conjunction 
with  data  collected  in  the  survey  and  the  census.  These 
models  would  then  be  applied  to  counties  not  in  the  sample. 
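
The idea can be sketched with an ordinary least squares fit on a single predictor. The form of the model is an assumption, since the choice of models, variables, and transformations is still under study; the predictor (share of housing in small multiunit buildings) and all figures are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical sample counties: census-based predictor and survey undercount rate (percent).
multiunit_share = [0.1, 0.2, 0.3, 0.4]
undercount_rate = [1.0, 1.5, 2.0, 2.5]
intercept, slope = fit_line(multiunit_share, undercount_rate)

# Apply the fitted model to a county that is not in the sample.
predicted = intercept + slope * 0.25
print(round(intercept, 3), round(slope, 3), round(predicted, 3))
```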


Our  research  in  this  area  involves  the  determination  as  to 
what  alternative  regression  models  might  be  used,  what 
variables  are  important  to  the  model,  and  what  transforma- 
tions on  the  data  might  be  needed.  Other  research  is  being 
conducted  into  the  possibility  of  using  smaller  areas  (e.g., 
blocks)  or  larger  areas  (e.g.,  county  groups)  as  the  basis  for 
regression  models. 

Synthetic  estimation  of  census  coverage  involves  applying 
coverage  rates  for  specific  segments  of  the  population  (e.g., 
racial,  socioeconomic,  or  residence  categories)  at  a  given 
geographic  level  (e.g.,  the  United  States)  to  the  population 
at  some  subordinate  level  (e.g.,  States).  For  example,  syn- 
thetic coverage  rates  for  counties  might  be  derived  by  apply- 
ing regional  coverage  rates  for  race/Hispanic  origin  and  in- 
come classes  to  county  populations  disaggregated  by  income 
and  race/Hispanic  origin.  Synthetic  estimation  has  been  used 
in a variety of applications [2, 3, 5], including illustrative
examples for evaluating the effects of adjusting census data
for undercoverage [9]. However, comparison of "demographic"
estimates of coverage and simple synthetic estimates
(based  on  age,  race,  and  sex  only)  for  States  indicates  that 
synthetic  estimates  not  only  fail  to  capture  the  full  range  of 
variation  in  coverage  rates,  but  also  differ  greatly  from 
demographic  estimates  for  some  States. 
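
A minimal sketch of synthetic estimation follows. The category names and coverage rates are hypothetical, chosen only to show the arithmetic: each category's census count in the small area is divided by the coverage rate measured for that category at the higher geographic level, and the results are summed.

```python
# Hypothetical regional coverage rates (fraction of the true population
# counted in the census) for two illustrative categories.
regional_coverage = {"group_a": 0.98, "group_b": 0.92}


def synthetic_estimate(census_counts, coverage_rates):
    """Return (corrected population, synthetic coverage rate) for one area.

    census_counts maps category -> census count for the area.
    """
    corrected = sum(
        count / coverage_rates[category]
        for category, count in census_counts.items()
    )
    counted = sum(census_counts.values())
    return corrected, counted / corrected


# Two invented counties with different category mixes.
county_1 = {"group_a": 9800, "group_b": 920}
county_2 = {"group_a": 4900, "group_b": 4600}

corrected_1, rate_1 = synthetic_estimate(county_1, regional_coverage)
corrected_2, rate_2 = synthetic_estimate(county_2, regional_coverage)
```

Because every area's rate is a weighted average of the same few category rates, the synthetic rates for all areas must fall between the lowest and highest category rate. This is consistent with the observation that synthetic estimates fail to capture the full range of variation in coverage.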

The  Census  Bureau  is  conducting  research  in  the  area  of 
synthetic  estimation.  As  noted,  the  survey  might  provide 
coverage  rates  for  categories  other  than  the  basic  demo- 
graphic ones  and  for  areas  smaller  than  the  entire  country. 
It  is  likely  that  synthetic  estimates  based  on  such  categories 
would  be  more  accurate  than  the  simple  synthetic  estimates 
employing  race  only,  or  race,  age,  and  sex  only,  as  compo- 
nents. Further,  the  synthetic  estimates  for  areas  below  the 
State  level  could  be  adjusted  to  the  combined  match  study- 
"demographic"  totals  for  States.  Alternative  synthetic  tech- 
niques are  being  compared  as  well  as  alternative  levels  of 
aggregation.  For  any  synthetic  estimation  procedure,  the 
results  of  the  match  studies  should  prove  quite  helpful  for 
validation  of  the  estimates. 

Simple  synthetic  estimates  based  on  race  only  could  be 
produced  as  early  as  mid-1981.  Addition  of  Hispanic  origin 
to  the  categories  requires  match  study  estimates,  which  are 
not  to  become  available  even  in  preliminary  form  until  mid- 
1982.  More  refined  synthetic  and  regression  estimation 
would require combined demographic analysis-match study
data,  which  will  not  be  available  until  late  1983.  Again, 
synthetic  and  regression  approaches  to  estimation  of  census 
coverage  of  sub-State  areas  are,  at  this  time,  considered  ex- 
perimental. The  resulting  estimates  may  not  be  sufficiently 
accurate  to  warrant  their  use. 

IMPLICATIONS  FOR  PUBLIC  PROGRAMS 

The  amount  of  public  money  disbursed  to  States,  cities, 
and  other  local  areas  on  the  basis  of  population  data  has 
become  substantial.  For  example,  since  the  inception  of  the 
general revenue sharing program in 1972, over $42 billion
has  been  distributed  in  a  7-year  period  (1972-78)  to  over 
39,000  governmental  units.  With  this  amount  of  money  at 
stake,  the  considerable  interest  in  the  effect  of  census  under- 
coverage,  particularly  local  area  differences,  on  the  distribu- 
tion of  funds  is  not  surprising. 

Statistical  Considerations  in  Equity 

Concerns  regarding  the  effect  of  census  errors  are  usually 
stated in terms of equity or fairness of the fund allocations.
Notions  of  equity  are  fundamental  to  our  system  of  political 
representation,  taxation,  and  public  expenditure,  yet 
"equity"  is  a  difficult  concept  to  define.  If  it  is  defined  as 
"everyone  receiving  a  fair  share,"  it  is  then  necessary  to 
define  what  is  "fair."  Generally  what  is  fair  is  not  necessarily 
obvious  and  consequently  equity  can  be  difficult  to  measure. 

Equity  in  allocation  of  resources  can  be  characterized  in 
terms  of  three  separate  aspects:  need,  effort,  and  capacity. 
Need,  in  this  typology,  is  the  underlying  requirement  for 
assistance,  particularly  in  terms  of  public  services,  and  may 
be  represented  by  population.  Effort  measures  the  resources 
already  being  contributed  to  meeting  the  need;  sometimes 
effort  is  measured  relative  to  total  resources.  Capacity  refers 
to  the  resources  potentially  available  to  meet  the  need  and 
may  be  represented  by  per  capita  income.  Not  all  aspects  of 
equity  necessarily  come  into  play  in  all  situations;  determi- 
nation of  political  representation,  amount  of  taxation,  and 
allotment  of  revenues  all  require  somewhat  different  char- 
acterization of  equity. 

Defining  equity  in  revenue  allocation  (or  taxation)  is  not 
the  duty  of  the  Census  Bureau  or  any  other  statistical  agency. 
The  meaning  of  equity  must  normally  be  taken  as  the  intent 
of  Congress  as  embodied  in  the  law.  Congressional  intent, 
as  embodied  in  Federal  grant-in-aid  formulas,  allocates 
resources on the basis of need, capacity, and effort, the
factors previously mentioned. Each factor is assumed to be
observable,  in  some  sense,  at  the  State  or  local  level;  then 
each  one  must  be  defined  operationally  in  terms  of  some 
statistical  measure.  Congress  generally  leaves  to  the  Federal 
statistical  agency  the  choice  of  data  source  or  of  estimation 
procedure  for  generating  the  required  data  series.  Thus, 
equity  considerations  relate  to  how  well  census  data  (or 
other  data  generated  by  the  Census  Bureau)  produce  allo- 
cations that  correspond  to  congressional  intent. 

In  connection  with  data  used  for  allocation  of  funds, 
factors  that  contribute  to  departures  from  equity  include 
bias,  variance,  cost,  timeliness,  and  appropriateness.  The 
first  two  factors  are  familiar  to  statisticians;  cost  is  all  too 
familiar  to  everyone.  Timeliness  denotes  the  extent  to  which 
the  time  frame  of  the  data  is  the  same  as  required  by  the 
formula.  Appropriateness  can  be  defined  as  the  extent  to 
which  the  concept  being  used  (no  matter  how  well  measured) 
corresponds  to  the  intent  of  the  law.  Clearly  the  distinction 
between  bias  and  inappropriateness  is  arbitrary,  depending 
largely  upon  what  the  analyst  believes  is  being  measured. 


Distinguishing  between  bias  and  variance  depends  on  the 
underlying  model.  For  example,  if  the  probability  of  being 
counted  in  the  census  is  assumed  to  be  constant,  all  differ- 
ential undercount  of  areas  is  variance.  Alternatively,  if  being 
enumerated  is  assumed  to  depend  on  race,  then  failure  to 
correct  for  differential  undercount  between  whites  and 
blacks  leaves  "bias"  in  the  counts;  any  remaining  differ- 
ential undercount  beyond  that  accounted  for  by  differences 
in  racial  composition  is  "variance."  Extensions  of  such 
models  bring  more  information  into  the  bias  category. 
Equity  considerations  might  require  that  known  biases  in 
the  data  be  corrected.  However,  "known"  biases  tend  to  be 
those  that  are  measurable,  not  necessarily  those  that  are 
largest  or  most  important.  Furthermore,  correcting  a 
"known"  bias  with  an  estimate  may  introduce  considerable 
"variance,"  with  the  resulting  failure  to  correct  the  erroneous 
fund  allocation. 

Cost  and  timeliness  have  a  place  in  equity  discussions 
also.  Reduced  error  and  thus  increased  equity  in  fund  allo- 
cations has  a  cost.  If  costs  are  conceived  of  as  coming  out 
of  the  pool  of  funds  to  be  allocated,  then  at  some  point  it 
is  to  no  one's  advantage  to  spend  more  money  to  reduce 
error.  Well  before  that  point,  it  will  not  be  to  the  advantage 
of  most  jurisdictions  to  spend  more  to  reduce  statistical 
error.  The  timeliness  of  adjustments  can  affect  equity.  For 
some  adjustment  procedures,  the  required  estimates  will 
take  several  years  to  complete.  In  the  meantime,  fund  allo- 
cations will  have  to  be  based  on  estimates  derived  from 
unadjusted  census  data  or  data  adjusted  with  preliminary 
figures. 

Impact  of  Census  Errors  on  Fund  Allocations 

In  assessing  the  effects  of  data  errors  on  fund  allocations, 
two  dimensions  must  be  considered.  First,  we  need  to  con- 
sider whether  the  allocation  is  on  a  per  capita  basis  or  ap- 
portioned on  a  competitive  basis  and,  second,  we  need  to 
consider  whether  the  funds  are  allocated  on  the  basis  of 
total  population,  some  segment  of  the  population,  or  some 
factor  or  factors  in  addition  to  population,  such  as  per 
capita  income. 

Capitation  grants,  or  funds  distributed  on  a  per  capita 
basis,  allocate  a  fixed  amount  of  money  per  eligible  person 
to  each  subdivision: 

$$D_j = K N_j$$

where $D_j$ is the amount allocated to the $j$th subdivision,
$K$ is the fixed amount per eligible, and $N_j$ is the eligible
population in the $j$th subdivision. For funds distributed on
a  per  capita  basis,  the  total  amount  distributed  depends  on 
the  size  of  the  particular  population,  without  reference  to 
the  population  of  other  areas.  Any  data  error  obviously 
affects  the  distribution  and,  if  there  is  an  undercount  of 
population,  there  is  an  underallocation  of  funds. 
Typically, however, Federal funding programs have an
apportionment  feature;  that  is,  a  preestablished  sum  of 
money  is  distributed  to  a  class  of  governmental  units  such 
as  States  or  States  and  their  political  subdivisions.  One 
version  of  proportionate  allocation  distributes  funds  to 
subdivisions  in  proportion  to  the  eligible  population: 


$$D_j = D \times \frac{N_j}{\sum_i N_i}$$

where $D$ is the fixed amount to be allocated. If funds are
distributed  solely  on  the  basis  of  population  under  such  an 
apportionment  formula  and  if  the  population  of  the  coordi- 
nate governmental  units  is  adjusted  for  underenumeration 
by  a  common  percentage,  however  large,  the  funds  appor- 
tioned to  each  governmental  unit  clearly  would  not  be 
affected  by  the  adjustment.  It  should  be  recognized,  there- 
fore, that  shifts  in  the  funds  apportioned  to  governmental 
units  (State  or  local)  depend  on  the  variation  in  the  under- 
enumeration rates  among  the  areas. 
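
The contrast between capitation and proportionate apportionment can be checked numerically. The sketch below uses three invented areas and invented adjustment factors; it illustrates only that a common percentage adjustment leaves apportioned shares unchanged, while differential adjustment shifts them.

```python
def capitation(populations, amount_per_eligible):
    """Per capita grants: each area's amount depends only on its own count."""
    return [amount_per_eligible * n for n in populations]


def proportional_shares(populations, total_amount):
    """Apportion a fixed total in proportion to each area's population."""
    grand_total = sum(populations)
    return [total_amount * n / grand_total for n in populations]


# Hypothetical census counts for three areas and a fixed fund of 600.
census = [1000.0, 2000.0, 3000.0]
base = proportional_shares(census, 600.0)

# Every area adjusted upward by the same 5 percent: shares do not move.
uniform = [n * 1.05 for n in census]
after_uniform = proportional_shares(uniform, 600.0)

# Differential adjustment: 10 percent for the first area, 2 percent elsewhere.
differential = [1000.0 * 1.10, 2000.0 * 1.02, 3000.0 * 1.02]
after_differential = proportional_shares(differential, 600.0)

# Under capitation the same uniform adjustment raises every area's grant.
capitation_base = capitation(census, 0.10)
capitation_uniform = capitation(uniform, 0.10)
```

In this example the first area's population is raised by 10 percent, but its apportioned amount rises by only about 6.5 percent, and the other two areas lose funds even though their counts were also adjusted upward.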

Clearly, under  competitive  apportionment,  discussions  of 
equity  should  not  relate  to  the  accuracy  of  the  estimate  for  a 
particular  area,  but  rather  to  the  accuracy  of  the  estimates  for 
all  areas  in  a  set;  that  is,  how  well  does  the  estimated  distri- 
bution reflect  the  unknown  "true"  distribution.  There  are 
many  ways  to  measure  these  differences,  for  example,  sum 
of  absolute  errors,  sum  of  squared  errors,  sum  of  underalloca- 
tions,  etc.  The  choice  of  a  measure  of  inequity  is  itself  a 
value  judgment  and  would  imply  certain  notions  of  equity. 
The  papers  by  Spencer  and  Fellegi  presented  at  this  confer- 
ence discuss  some  of  these  issues. 
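
The three measures named above can be written down directly. The allocations below are invented, and the "true" distribution is of course unknown in practice, so this is a sketch of the definitions rather than a usable evaluation procedure.

```python
def inequity_measures(estimated, true):
    """Compare an estimated allocation with a (hypothetical) true allocation."""
    errors = [e - t for e, t in zip(estimated, true)]
    return {
        "sum_of_absolute_errors": sum(abs(d) for d in errors),
        "sum_of_squared_errors": sum(d * d for d in errors),
        # Total shortfall over the areas that received too little.
        "sum_of_underallocations": sum(-d for d in errors if d < 0),
    }


# Invented allocations of a fixed total of 600 among three areas.
estimated_allocation = [100.0, 200.0, 300.0]
true_allocation = [110.0, 195.0, 295.0]

measures = inequity_measures(estimated_allocation, true_allocation)
```

When the total is fixed, the errors sum to zero, so the sum of absolute errors is exactly twice the sum of underallocations. The squared-error measure differs in kind: it weights one large misallocation more heavily than several small ones, which is the sort of value judgment the text refers to.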

When  the  formula  for  distributing  funds  involves  factors 
in  addition  to  population,  the  adequacy  of  the  data  on  these 
additional  factors  (e.g.,  per  capita  income)  also  has  a  bearing 
on  the  adequacy  of  the  allocations.  The  importance  of  these 
other  factors,  vis-a-vis  population,  is  often  overlooked  in  the 
assessment  of  the  impact  of  data  errors  on  the  equity  of  the 
distribution  of  public  funds. 

The  Census  Bureau  has  carried  out  several  studies  that 
assess  illustratively  the  effect  of  census  underenumeration 
on  the  distribution  of  funds  under  various  public  programs. 
In an earlier study [9], population counts for States corrected
by synthetic procedures were employed to illustrate
the  effect  of  underenumeration  on  apportionment  formulas. 
It  was  found  that  under  an  apportionment  rule  based  on 
population  size  alone,  the  size  and  variation  of  the  percent- 
age shifts  in  funds  allocated  to  States  resulting  from  a  cor- 
rection of  census  counts  would  be  far  less  than  the  population 
undercount  rates.  It  was  also  found  that  more  States  would 
lose  money  than  would  gain  money  under  the  changed  ap- 
portionment. The  1975  study  also  examined  the  effect  of 
underenumeration  and  the  underreporting  of  income  on  the 
distribution  of  general  revenue  sharing  funds  at  the  State 
level. 

A  later  companion  study  [6]  assessed  the  effects  of  data 
corrections on general revenue-sharing allocations among
the counties and local areas in two States, New Jersey and
Maryland.  This  study  clearly  demonstrates  that,  under  the 
general  revenue  sharing  formula,  the  adjustment  for  popu- 
lation undercount  in  the  population  factor  alone  of  the 
formula  results  in  little  change  in  funds  apportioned.1  The 
effect  of  the  population  adjustment  is  dampened  consider- 
ably as  a  result  of  the  apportionment  feature  of  the  formula. 
The  effect  of  the  population  adjustment  is  even  less  if  popu- 
lation in  the  population  factor  and  population  in  the  tax 
effort  factor  are  both  adjusted,  as  is  more  reasonable.2  The 
funds  apportioned  would  be  altered  to  a  much  greater  extent, 
especially  at  the  sub-State  levels,  if  both  population  and 
income  were  fully  adjusted  for  understatement.3  Income, 
not  population,  then  emerges  as  the  dominant  element  in 
affecting  the  revenue-sharing  allocations  when  the  data  are 
corrected  for  understatement  of  the  components. 
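
The dampening described above can be illustrated with the simplified three-factor form given in footnote 1. The sketch below is an assumption-laden illustration: the figures are invented, and the actual revenue sharing formula contains further provisions that are not modeled here. It computes only an area's unnormalized weight, before apportionment among competing areas.

```python
def three_factor_weight(pop_factor, pop, income, taxes, pop_in_tax_effort):
    """Simplified weight: P x (P/I) x T / [P x (I/P)].

    pop_factor        population used in the population factor
    pop, income       unadjusted census population and aggregate income,
                      used for the per capita income factor
    pop_in_tax_effort population used inside the tax effort denominator
    """
    per_capita_income_factor = pop / income
    implied_income = pop_in_tax_effort * (income / pop)
    tax_effort = taxes / implied_income
    return pop_factor * per_capita_income_factor * tax_effort


# Invented figures for one locality.
P, I, T = 1000.0, 5_000_000.0, 250_000.0
P_adjusted = 1050.0  # population after a hypothetical 5 percent correction

unadjusted = three_factor_weight(P, P, I, T, P)
population_factor_only = three_factor_weight(P_adjusted, P, I, T, P)
both_population_terms = three_factor_weight(P_adjusted, P, I, T, P_adjusted)
```

In this simplified form, adjusting the population factor alone raises the weight by the full 5 percent before apportionment, whereas adjusting the population in both the population factor and the tax effort factor cancels out and returns the unadjusted weight.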

The  results  of  the  study  also  reflect  the  tendency  for  most 
areas  to  lose  money  under  an  adjustment,  whether  of  popu- 
lation only  or  of  population  and  income  combined.  Most 
areas  tend  to  move  in  the  direction  of  their  "proper"  allot- 
ment (i.e.,  the  allotment  that  would  be  received  where  all 
factors  are  adjusted)  even  when  population  alone  is  adjusted, 
but  for  most  areas  this  means  a  loss  of  funds  rather  than  a 
gain.  When  population  rather  than  areas  is  examined,  this 
tendency  generally  holds;  that  is,  under  most  adjustments, 
the  majority  of  people  live  in  areas  that  would  lose  money. 
The  situation  is  less  clear  for  the  black  population  considered 
separately.  For  some  of  the  sets  of  adjusted  data,  the  ma- 
jority of  blacks  live  in  areas  that  would  gain  money. 

The  finding  that  the  majority  of  States  and  areas  (in  New 
Jersey  and  Maryland)  would  lose  funds  if  the  population  and 
income  factors  are  corrected  in  the  revenue-sharing  formula 
could  have  significant  political  implications.  If  this  pattern 
can  be  generalized  to  the  distribution  of  funds  in  all  States, 
it  may  be  expected  that  proposals  to  adjust  for  data  errors 
used  in  revenue  sharing  will  be  strongly  opposed  by  the 
governmental  units  (which  form  a  majority)  that  stand  to 
lose  funds  by  the  change. 


1The population factor was adjusted for undercoverage by the
synthetic method. The basic 3-factor revenue sharing formula that
determines the allocation amounts for localities may be represented
in simple form by:

$$P \times (P/I) \times (T/I) = P \times (P/I) \times \{T \div [P \times (I/P)]\} = (P/I)^2 \times T$$

where P represents population, I aggregate money income, (P/I)
the per capita income factor (the reciprocal of per capita income),
T net adjusted taxes, and (T/I) tax effort.

2The  population  factor  and  the  population  element  in  the  tax 
effort  factor  were  adjusted  for  undercoverage  by  the  synthetic 
method.  Population  cannot  be  adjusted  in  the  per  capita  income 
factor  unless  the  aggregate  income  factor  is  also  adjusted  for  the 
income  of  the  persons  added  by  the  adjustment,  i.e.,  the  per  capita 
income  figure  as  estimated  is  more  accurate  than  a  figure  derived  by 
adjusting  the  population  component  only. 

3The  population  factor  and  the  population  element  in  the  tax 
effort  factor  were  adjusted  for  undercoverage  by  the  synthetic 
method;  the  per  capita  income  and  tax  effort  factors  were  adjusted 
for  income  underreporting  by  substituting  BEA  estimated  income 
for  census  data. 
It is not easy to define or achieve equity of allocations
in  the  face  of  data  errors.  With  less  than  perfect  data,  the 
allocation  of  funds  will  be  less  than  optimal.  The  function  of 
the  statistician  is  not  necessarily  to  provide  error-free  data, 
but  rather  to  try  to  identify  the  largest  errors  and  try  to 
control  them.  However,  when  discussions  of  equity  focus 
on  preexisting  allocation  formulas  for  large  sums  of  money, 
the  discussions  necessarily  move  beyond  the  statistical 
realm  to  encompass  the  political.  Therefore,  papers  presented 
at  this  conference  cover  both  realms. 
INTRODUCTION

The  way  censuses  are  taken  has  a  bearing  on  the 
undercount  and  hence  on  the  use  of  the  census  for  allocation 
and  apportionment.  Much  of  what  follows  is  a  discussion  of 
census  practice.  (More  extended  discussion  is  provided  in 
National Research Council [4] and Keyfitz [3].) The concluding
section shows what can be done about it; i.e., it shows
the  several  ways  that  the  parties  using  the  census  for  alloca- 
tion of  funds  can  come  to  agreement.  The  list  of  options 
given  is  intended  to  be  exhaustive  in  the  sense  that  any 
proposal  for  handling  the  undercount  would  fit  under  one 
of  the  three  headings.  This  paper  expresses  no  preference 
among  the  options,  but  attempts  to  set  forth  the  advantages 
and  drawbacks  of  each.  The  reader  who  is  concerned  only 
with  action  on  the  undercount  can  proceed  directly  to  the 
concluding  section  and  see  where  his  preferences  fall. 

ERROR  IS  A  PART  OF  CENSUS  TAKING 

The  most  easily  written  part  of  articles  on  census 
completeness  is  an  exhortation  to  the  Bureau  of  the  Census 
to  do  better,  to  take  the  1980  census  exactly  with  no  errors. 
The writers of such articles are often unaware that over the
last  40  years  the  Bureau  of  the  Census  has  pioneered  in  the 
reduction  of  census  error.  The  errors  that  remain  are  not  due 
to  negligence  or  ignorance  on  the  part  of  Bureau  personnel, 
who  know  more  about  errors  of  counting  than  any  other 
group  in  the  world.  The  difficulties  with  which  this  con- 
ference is  concerned  will  not  be  removed  by  change  of 
management  or  adoption  of  any  obvious  new  methods.  It 
will  take  a  very  clever  journalist  to  see  more  deeply  into  the 
problem  of  completeness  than  have  the  series  of  brilliant 
census  leaders  of  the  past  40  years. 

Yet  the  nagging  thought  persists  that  the  census  really 
ought  not  to  be  appreciably  incomplete.  If  200  passengers  in 
an  airplane  can  be  counted  exactly,  with  zero  error,  why  not 
200  million?  The  census  is  taken  by  dividing  the  country  on 
maps  into  small  areas  containing  an  average  of  1,000  persons 
and  assigning  the  responsibility  for  each  area  to  one 
enumerator.  Zero  error  in  each  of  220,000  areas  would  mean 
zero  error  for  the  country  as  a  whole.  There  are  a  number  of 
things  wrong  with  this  commonsense  view,  and  until  it  is 
disposed  of  the  public  is  going  to  be  impatient  with 
discussions  of  the  kind  for  which  this  conference  has  been 
called.  My  main  point  concerns  the  penumbra  of  irremovable 
arbitrariness  around  any  permissible  way  of  taking  the  census. 


Anyone  can  think  of  ways  of  taking  a  more  nearly 
complete  census  if  some  of  the  constraints  can  be  relaxed.  If 
people  could  be  required  to  stay  home  for  one  day,  or  until 
the  enumerator  calls;  if  people  could  be  given  a  button 
indicating  that  they  had  been  enumerated,  and  required  to 
wear  it;  if  this  or  some  other  means  of  showing  that  they  had 
been  enumerated,  perhaps  a  card,  were  required  for  trans- 
acting such  business  as  cashing  a  paycheck,  drawing  social 
security,  being  attended  by  a  doctor,  etc.,  they  would  have 
an  immediate  interest  in  being  included  in  the  census.  Such 
devices  have  not  been  found  acceptable  in  the  United  States. 
They  might  be  applied  in  totalitarian  societies,  though  there 
it  turns  out  that  other  inefficiencies  intervene;  censuses  taken 
in  the  U.S.S.R.  have  been  bad,  and  one  even  had  to  be 
abandoned  before  publication. 

Public  impatience  tends  to  be  proportional  to  the 
quantity  of  funds  distributed  on  the  basis  of  the  census 
count.  Yet  error  is  an  integral  part  of  counting;  the 
difference  between  a  precise  survey  and  a  poor  one  is  in  the 
amount  of  error,  not  in  that  one  contains  errors  and  the 
other  doesn't. 

DIMINISHING RETURNS TO EFFORT

One  of  the  lessons  of  statistics  is  that  the  several  sources 
of  error  in  any  survey  have  to  be  seen  in  relation  to  one 
another.  With  large  unremovable  sources  of  error  it  may  not 
pay  to  remove  smaller  sources.  To  give  an  example,  where 
population  and  income  are  obtained  from  sample  surveys  and 
allocation  is  to  be  made  on  the  product,  suppose  that  income 
per  head  is  subject  to  a  standard  error  of  10  percent  and 
population  to  a  standard  error  of  1  percent,  these  errors 
being  independent  of  one  another.  If  we  could  do  nothing 
about  the  error  of  income,  but  by  doubling  the  expenditure 
on  the  census  we  could  reduce  the  population  error  to  0.7 
percent,  then  without  the  doubling  of  expenditure  we  have 
an  overall  standard  error  of  10.05  percent;  with  the  double 
expenditure  on  the  census,  we  have  a  standard  error  of 
10.025  percent.  A  100-percent  increase  in  expenditure 
increases  the  accuracy  of  the  required  product  by  2.5  parts 
per  thousand. 
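
The arithmetic of this example can be reproduced as follows. The sketch assumes the usual first-order approximation for the product of two independent quantities, in which the squared relative standard errors add; the function name is chosen here for illustration.

```python
import math


def product_relative_se(se_a_percent, se_b_percent):
    """Approximate relative standard error (percent) of a product of two
    independent estimates, given each factor's relative standard error."""
    return math.sqrt(se_a_percent ** 2 + se_b_percent ** 2)


# Income per head at 10 percent; population at 1 percent, or 0.7 percent
# after doubling census expenditure.
se_before = product_relative_se(10.0, 1.0)
se_after = product_relative_se(10.0, 0.7)

# Relative improvement in the standard error of the product,
# expressed in parts per thousand.
gain_per_thousand = (se_before - se_after) / se_before * 1000
```

The larger error dominates the square root, so almost nothing is gained by reducing the smaller one.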

DEFINITIONAL  BOUNDARIES 

One  source  of  definitional  uncertainty  is  whether  "popu- 
lation" includes  illegal  aliens.  The  issue  is  being  bitterly 
debated  to  the  point  of  putting  the  census  itself  in  jeopardy 
(New York Times, Dec. 21, 1979; Feb. 7, 1980). The
Constitution,  as  modified  by  the  14th  Amendment,  orders 
the  counting  of  "the  whole  number  of  persons  in  each  State, 
excluding  Indians  not  taxed."  This  could  well  be  taken  as 
including  illegal  aliens,  but  a  lawsuit  has  been  entered  suing 
for  their  omission  from  the  census.  Yet  the  line  between 
legal  and  illegal  immigrants  has  become  steadily  less  clear  in 
recent  years;  both  obtain  substantial  social  services  and  both 
pay  sales  and  other  taxes.  The  question  whether  people 
should  be  here  ought  to  be  addressed  to  the  Immigration  and 
Naturalization  Service;  the  census  is  only  concerned  with 
whether  they  are  here. 

The matter is not trivial for the States and cities that have
large numbers of them. In the apportionment of the House
of Representatives, for instance, it is said in press reports
(though  in  the  nature  of  the  case  no  one  knows)  that  if  illegal 
aliens  are  counted,  California  would  gain  six  extra  seats  and 
New  York  would  gain  three.  Kansas,  Washington,  and  a 
dozen  other  States  would  correspondingly  lose  one  repre- 
sentative each,  the  House  total  being  fixed  at  435.  New  York 
City  would  gain  5  percent  of  its  allocation  of  Federal  funds 
if  the  illegals  are  counted.  These  numbers  are  clearly  ex- 
aggerated and,  in  any  case,  the  illegals  would  be  subject  to 
gross  omission  no  matter  how  the  census  set  the  definition. 

It  is  hard  to  imagine  the  question  of  illegal  aliens  arising 
when the Constitution was framed or amended. Only now,
when redistributive legislation has made them important, is the
matter viewed with deadly seriousness.

Here  is  an  example  (others  will  arise  later)  of  what  may  be 
called  the  hardening  of  expectations.  When  Congress  first 
offered  to  pay  for  school  lunches,  say  at  so  much  per  year 
per  child  in  school,  no  one  was  likely  to  ask  whether  children 
here  illegally  would  partake.  But  once  the  program  has  been 
going  for  some  time  and  is  thoroughly  incorporated  in  the 
community's  receipts  and  expenditures,  attention  comes  to 
be  focused  very  intensively  on  such  marginal  questions. 

Though  undocumented  aliens  are  the  largest,  and  legally 
the  most  interesting,  of  the  groups  that  give  rise  to  dispute 
on  the  part  of  those  who  will  benefit  from  a  particular 
definition  of  population,  there  are  a  host  of  other  points  on 
which  census  practice  could  be  challenged  once  the  door  is 
opened. College students are to be counted as residents of
the  college  community  where  they  study,  not  of  their 
parents'  home,  even  though  they  may  be  with  their  parents 
when  the  enumerator  calls.  On  the  other  hand,  children  in 
a  residential  secondary  school  are  counted  in  the  household 
of  their  parents,  irrespective  of  where  they  happen  to  be  at 
the  time  of  the  census.  A  town  with  a  large  secondary 
boarding  school  could  well  sue  to  have  the  students  con- 
sidered as  residents. 

Since  there  is  no  general  definition  of  resident,  the  census 
has  to  decide  where  to  draw  its  many  boundary  lines.  The 
lines  should  be  sharp  and  objective  on  the  one  hand  and 
suited  to  the  concept  of  "usual  residence"  on  the  other. 
These  two  considerations  may  conflict.  In  many  instances 


the  sharp  definition  is  not  appropriate  to  the  use  of  the 
results.  Persons  who  have  more  than  one  home  and  divide 
their  time  between  them  are  to  be  listed  where  they  spend 
the  largest  part  of  the  calendar  year,  according  to  the  census 
instructions.  It  would  be  sharper  to  put  them  where  they 
spend  the  largest  part  of  the  current  week  or  where  they 
are  actually  found  at  the  time  of  enumeration,  but  this 
would  be  less  in  accord  with  the  objective  of  finding  the 
usual  population. 

To  the  arbitrariness  incorporated  in  the  census  definitions 
must  be  added  the  errors  in  implementation  by  the  enu- 
merator. A  census  is  an  intricate  affair  hedged  about  by 
arbitrary  definitions,  enumerated  by  people  who  cannot  but 
add  their  own  errors  in  transcribing  the  (not  always  exact) 
information   to   those   provided  them   by  the  respondents. 

SPENDING  MORE  MONEY 

Beyond  all  questions  of  definition  is  the  matter  of  those 
who  should  have  been  included  in  the  census  and  who  were 
just  not  caught.  About  half  of  these  were  in  households  that 
were  not  reported  on  any  list  and  not  known  to  the  post 
office.  The  other  half  were  members  of  a  known  household 
but  were  somehow  omitted  from  its  census  questionnaire, 
typically because the person responding for the household
failed  to  report  the  individual.  Either  there  was  no  evidence 
of  their  existence  such  as  would  put  the  enumerator  on  their 
trail,  or  there  was  evidence  but  the  enumerator  was  delin- 
quent. An  enumerator  will  make  one,  two,  three,  .  .  .  call- 
backs for  a  person  not  at  home  on  the  first  call,  but  a  time 
comes  when  the  most  devoted  enumerator  gives  up. 

The  Bureau  has  shown  exceptional  skill  in  arranging 
publicity  for  the  1980  census.  Newspapers  and  radio  and 
television  stations,  especially  including  those  run  by  minori- 
ties, have  realized  the  importance  of  the  census  and  have 
already  given  a  great  deal  of  free  and  favorable  publicity.  But 
all  this,  along  with  millions  of  dollars  of  expenditure  on 
preparation,  can  be  swamped  by  random  unfavorable 
publicity  on  the  eve  of  the  census.  Uncontrollable  circum- 
stances, like  the  unwillingness  of  ecclesiastical  authorities  to 
reassure  their  constituents  on  the  confidentiality  of  the 
census  form,  can  work  against  completeness. 

Aside  from  all  this,  the  quality  and  completeness  of 
enumeration  will  be  affected  in  1980  by  some  of  the  social 
changes  that  we  see  around  us.  Housewives  were  the 
backbone  of  the  enumeration  force  in  earlier  times,  and 
other  housewives  usually  answered  the  doorbells  they  rang. 
Now  many  of  this  group  have  regular  jobs;  an  excellent 
source  of  census  labor  has  dried  up,  on  the  one  hand,  and,  on 
the  other,  the  enumerators  have  to  do  much  of  the  job  in  the 
evening. 

In  addition  to  there  being  more  people  constituting 
one-person  households,  which  have  always  been  harder  to 
enumerate  than  families,  people  are  more  mobile,  many 
people  are  suspicious  of  government,  and  more  are  concerned 
about their privacy. There are more laws; hence, more people
have something to conceal, from lodgers in a zone where
they  are  prohibited  to  workers  enjoying  unreported  incomes. 
Registration  of  youth  for  the  Armed  Forces,  proposed  2 
months  before  the  census  date,  will  cause  some  reticence. 
On  the  other  side,  the  Census  Bureau  is  applying  new  tech- 
niques and  spending  much  more  money  than  it  did.  One  can 
only  hope  that  this  will  offset  the  greater  difficulties. 

HANDLING  THE  NOT  STATED 

The  unavoidable  arbitrariness  of  the  census  does  not  end 
with  the  definitions  as  specified  in  the  instructions  nor  with 
the  (one  suspects  highly  variable)  interpretations  of  these  by 
respondents  and  enumerators.  It  continues  through  the 
processing  of  census  schedules.  In  1970,  some  4.5  million 
individuals  for  whose  existence  there  was  more  or  less 
evidence,  but  for  whom  no  information  on  characteristics 
was  reported,  had  to  be  incorporated  in  the  tabulations  by 
some  kind  of  calculation.  The  convenient  way  of  doing  this, 
which  has  been  used  since  1960,  is  to  duplicate  the  record 
for  the  last  person  enumerated  whose  characteristics  were 
reported.  Thus,  the  Bureau  of  the  Census  takes  advantage  of 
any  homogeneity  of  local  areas  in  respect  of  income  and 
other  features,  since  residential  segregation  of  many  kinds  is 
still  a  fact.  When  data  are  missing  for  a  number  of 
consecutive  individuals,  the  computer  searches  beyond  the 
nearest  person  completely  enumerated;  the  Bureau's  rules  do 
not  permit  duplication  of  one  individual  more  than  three 
times. 
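
The duplication rule described above is commonly known as hot-deck imputation. A minimal sketch in Python follows, assuming records arrive in enumeration order and that each record is either fully reported or wholly missing. The names `hot_deck` and `MAX_USES`, and the backward search order once a donor is exhausted, are illustrative assumptions and not the Bureau's actual program.

```python
from typing import Optional

MAX_USES = 3  # the text says one individual may not be duplicated more than three times


def hot_deck(records: list[Optional[dict]]) -> list[Optional[dict]]:
    """Fill missing records (None) from the nearest earlier reported record.

    Assumption: when the nearest donor has already been duplicated MAX_USES
    times, search further back for an earlier donor with uses remaining.
    """
    donors: list[list] = []  # [record, times_used], in enumeration order
    filled: list[Optional[dict]] = []
    for rec in records:
        if rec is not None:
            donors.append([rec, 0])
            filled.append(rec)
            continue
        # Walk backward from the most recent donor to one not yet exhausted.
        for donor in reversed(donors):
            if donor[1] < MAX_USES:
                donor[1] += 1
                filled.append(dict(donor[0]))  # copy, so records stay independent
                break
        else:
            filled.append(None)  # no eligible donor; left unresolved in this sketch
    return filled


# Hypothetical run of eight persons, five of them with nothing reported.
people = [{"income": 9000}, None, None, {"income": 4000}, None, None, None, None]
print([p["income"] for p in hot_deck(people)])
```

In this run the last missing person cannot draw on the second donor, who has already been used three times, so the search falls back to the first donor.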

Such  duplication  of  characteristics  is  preferable  to  retain- 
ing "not-stated"  entries  through  the  tabulation  to  the 
published  volumes.  As  was  pointed  out  as  early  as  the  1940's 
by  Deming,  Hansen,  and  others,  not-stated  entries  are  both
costly  to  print  and  of  very  little  informational  value.
The  user  is  saved  trouble,  the  printing  bill  is  reduced,  and 
accuracy  is  served  if  the  nearest  known  case  is  duplicated. 
Aside  from  the  4.5  million,  all  of  whose  characteristics  were 
obtained  in  this  way,  many  others  lacked  information  on  one 
or  a  few  questions.  Income  was  often  omitted,  and  since  it  is 
important  in  many  allocation  formulas,  it  was  duplicated  for 
some  15  percent  of  cases.

Yet  few  users  of  census  data  need  to  refer  to  these  fine 
points,  which  affect  the  margins  of  the  count,  not  its  core. 
Think  of  the  typical  questions  asked  of  the  census:  Are 
people  marrying  younger  than  they  did?  Do  the  rich  have 
fewer  children  than  the  poor?  Which  service  occupations  are 
declining,  which  increasing,  as  we  move  into  the  postindus- 
trial  society?  Enumeration  error,  or  the  arbitrariness  of 
definitions,  matters  hardly  at  all  for  these.  Error  is  tolerable, 
and  it  would  be  wasteful  to  enumerate  everyone;  the  error  of 
a  5-percent  sample  is  perfectly  acceptable. 

None  of  this  latitude  is  so  readily  accepted  when  the 
census  is  used  for  allocation.  This  article,  having  no  sug- 
gestions for  taking  a  perfect  census,  will  explore  the  ways  of 
living  with  imperfection. 


TARGETING  THE  LEGISLATION 

There  is  a  clear  criterion  for  how  the  census  ought  to  be 
carried  out  in  view  of  its  use  for  allocation.  Insofar  as  people 
constitute  costs  for  a  jurisdiction,  that  is  where  they  should 
be  counted.  When  Congress  allocates  funds  for  a  school 
breakfast  program,  its  intention  is  best  served  if  the  funds  are 
given  to  jurisdictions  according  to  the  number  of  their  school 
children  poor  enough  to  need  the  breakfast.  Funds  for 
supplementary  benefits  to  the  unemployed  ought  to  be 
distributed  in  proportion  to  the  number  of  long-term 
unemployed,  etc.  One  need  only  say  this  to  realize  what  a 
crude  instrument  any  allocation  formula  must  be.  When 
Congress  apportions  funds  according  to  population  and  a  few 
other  measurable  variables,  deviations  from  the  target  of 
hundreds  of  percent  are  easily  possible.  Based  on  the 
formula,  a  city  may  get  an  allowance  for  something  it  does 
not  need  at  all.  This  does  not  mean  that  the  100  or  so 
Federal  laws  covering  support  for  education,  health,  trans- 
portation, housing,  manpower,  and  other  programs  [2] 
are  misconceived;  they  are  necessarily  targeted  by  some  de- 
terminate simple  formula  that  is  only  more  or  less  related  to 
need. 

It  fortunately  happens  that  a  series  of  censuses  provides 
more  information  than  any  one  census.  If  we  want  to  know 
how  many  people  were  present  in  the  United  States  in  1980, 
we  must  make  use  not  only  of  the  1980  census,  but  also  of 
those  of  1970,  1960,  and  earlier. 

OBTAINING  INFORMATION  FROM  A  SERIES 
OF  CENSUSES 

It  is  not  true  that  the  most  up-to-date  census  of  a  series 
provides  all  the  information  available  on  the  population  at  a 
given  time.  There  is  indeed  uncertainty  that  increases  year  by 
year  because  of  uncertainty  about  how  many  persons  died  or 
emigrated  and  how  many  were  added  by  birth  and  immigra- 
tion. To  infer  how  many  people  are  here  now,  using  the  1970 
census,  exposes  us  to  errors  in  the  components  of  change, 
and  the  same  errors  would  make  the  1960  and  1950  censuses 
less  useful  yet. 

Yet  these  errors  of  projection  can  be  less  than  the  known 
and  persistent  differences  in  the  completeness  of  enu- 
meration at  different  ages.  If  we  know  that  children  10  to  14 
years  of  age  at  last  birthday  are  more  precisely  counted  than 
people  20  to  24,  then  it  could  be  better  to  estimate  the  20- to 
24-year-olds  in  1980  not  from  the  1980  census,  but  from  the 
1970  census,  with  adjustment  for  migration  and  deaths.  If  the 
additional  error  of  the  ages  20  to  24  enumeration  is  much 
greater  than  the  error  of  adjustment  by  the  components, 
then  we  should  estimate  the  number  of  those  20  to  24  in 
1980  entirely  from  the  1970  census  and  disregard  the  1980 
count.  The  ages  10  to  14  cohort  in  1970  can  be  supple- 
mented by  births  of  1956-60,  which  projects  to  a  1980  figure 
for  the  20  to  24  age  group  that  may  or  may  not  be  more 
accurate  than  the  projection  of  the  1970  census. 
This  is  one  of  the  ways  in  which  we  will  know  the
completeness  of  the  1980  or  any  other  census  that  is  part  of 
a  series.  As  a  cohort  ages,  it  passes  through  successive 
censuses,  being  enumerated  more  completely  in  some  than  in 
others.  In  the  case  of  U.S.  blacks,  enumeration  is  grossly 
incomplete  under  age  5  and  again  at  ages  20  to  65;  between 
ages  5  and  20  the  count  is  relatively  complete.  The  method 
does  not  allow  for  persons  enumerated  twice;  it  assumes  that 
the  census  at  which  the  cohort  is  counted  highest  is  the  most 
accurate. 

Births  have  been  more  or  less  completely  registered  since 
the  1930's,  so  for  the  people  born  since  then,  we  need  not 
depend  on  censuses  at  all.  The  older  population  is  almost 
completely  registered  for  Medicare,  so  they  also  are  known  in 
total,  independently  of  the  censuses.  From  these  and  other 
sources  the  overall  completeness  of  the  census  can  be  cal- 
culated. 

Let  us  suppose  that  the  uncertainty  due  to  errors  in 
migration,  etc.,  increases  at  the  rate  of  0.2  percent  per  year, 
so  that  at  the  end  of  a  decade  a  range  of  2  percent  is  added 
to  our  ignorance  of  the  cohort;  at  the  end  of  2  decades,  a 
range  of  4  percent,  etc.  Given  that  ages  10  to  14  in  1960  were 
only  4.4  percent  short,  adding  an  error  of  2  percent  would 
still  be  better  than  using  the  1970  enumeration,  which  at 
ages  20  to  24  seems  to  have  been  8.5  percent  short.  Even  the 
census  of  20  years  ago  would  at  this  rate  be  better  than  the 
current  census  as  far  as  age  interval  25  to  34  years  is 
concerned,  it  having  shown  12.5  percent  incompleteness  in 
1970.  (Figures  for  blacks  reflect  Census  Bureau  estimates.) 
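
The comparison in this paragraph is simple arithmetic, sketched below in Python. The sketch assumes, as the text supposes, that projection uncertainty grows linearly at 0.2 percent per year and simply adds to the shortfall of the earlier count; the function name `projected_error` is hypothetical.

```python
DRIFT_PER_YEAR = 0.2  # percent of added uncertainty per year, as supposed in the text


def projected_error(shortfall_at_source: float, years: int) -> float:
    """Shortfall of the earlier count plus the uncertainty accumulated since."""
    return shortfall_at_source + DRIFT_PER_YEAR * years


# Figures from the text: ages 10 to 14 in 1960 were 4.4 percent short;
# ages 20 to 24 in 1970 were 8.5 percent short.
via_1960 = projected_error(4.4, 10)
direct_1970 = 8.5
print(round(via_1960, 1), direct_1970)
print("use the 1960 cohort" if via_1960 < direct_1970 else "use the 1970 count")
```

On the same assumption, a census 20 years old carries 4 percent of added uncertainty, so it beats a current count that is 12.5 percent short whenever the earlier shortfall was below 12.5 - 4 = 8.5 percent.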

If  we  could  think  of  the  count  as  a  random  variable,  and  if 
errors  of  updating  the  cohort  were  also  random,  then  the 
right  way  to  estimate  the  number  of  a  cohort  now,  say  the 
number  of  persons  40  to  45  years,  would  be  to  take  all  the 
preceding  censuses  and  average  their  projected  numbers,  with 
smaller  and  smaller  weights  going  back  in  time.  Unfor- 
tunately, this  procedure  applies  in  a  clearcut  way  only  to  the 
United  States  as  a  whole.  For  States,  the  unknown  migration 
may  be  large  enough  to  make  previous  censuses  obsolete.  Yet 
something  has  been  done  in  the  Bureau  of  the  Census  for 
States. 
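
If both kinds of error really were random and independent, one standard way to choose the declining weights would be inverse-variance weighting. That choice is an assumption made here for illustration, not something the text specifies, and the projections and variances below are hypothetical.

```python
def combine(projections: list[float], variances: list[float]) -> float:
    """Inverse-variance weighted mean of several projections of one cohort."""
    weights = [1.0 / v for v in variances]
    return sum(w * p for w, p in zip(weights, projections)) / sum(weights)


# Hypothetical projections (thousands) of one cohort from the 1970, 1960,
# and 1950 censuses; the variance grows with each decade of updating.
projections = [1000.0, 1020.0, 1050.0]
variances = [1.0, 2.0, 4.0]
print(round(combine(projections, variances), 1))
```

The oldest census still contributes, but with a quarter of the weight of the most recent one.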

HYPOTHESES  ON  THE  UNDERCOUNT 

Through  the  work  of  J.S.  Siegel  and  his  associates  [6],
based  on  methods  such  as  those  described  above,  we 
have  some  indication  of  the  amount  and  distribution 
of  the  undercount.  Presumably  the  smallest  areas  are  sub- 
ject to  the  largest  fraction  of  undercount.  This  is  not  easy 
to  demonstrate  in  detail,  since  the  smaller  the  area,  the  less 
sure  we  are  of  what  the  undercount  is,  but  one  item  of 
evidence  is  obtained  by  comparing  the  distribution  of  $1 
billion  according  to  the  census  as  enumerated  and  according 
to  the  census  as  adjusted  by  the  basic  synthetic  method,  i.e., 
by  multiplying  each  State  by  the  correction  for  the  Nation  as 
a  whole,  recognizing  groups  of  age,  sex,  and  race.  It  turns  out
that  the  adjustment  for  the  South  Atlantic  States  as  a  whole 
is  +0.5  percent,  i.e.,  the  group  would  obtain  0.5  percent 
more  on  the  adjusted  than  on  the  unadjusted  count.  But  the 
individual  States  range  from  -0.5  (West  Virginia)  to  +4.1 
(District  of  Columbia)  in  percent  difference.  Similar  results 
are  found  on  comparing  the  total  for  the  other  regions  with 
the  individual  States  [6].
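
The comparison just described can be sketched in Python as follows. The two areas, their group counts, and the national correction factors are all hypothetical; only the $1 billion sum and the idea of applying national factors to each area's groups come from the text.

```python
FUND = 1_000_000_000  # the $1 billion used for comparison in the text


def shares(populations: dict[str, float]) -> dict[str, float]:
    """Split FUND among areas in proportion to population."""
    total = sum(populations.values())
    return {area: FUND * pop / total for area, pop in populations.items()}


def synthetic_adjust(counts: dict[str, dict[str, float]],
                     national_factor: dict[str, float]) -> dict[str, float]:
    """Multiply each area's group counts by the national correction factors."""
    return {
        area: sum(n * national_factor[group] for group, n in groups.items())
        for area, groups in counts.items()
    }


# Hypothetical enumerated counts by group for two areas, and hypothetical
# national correction factors (1 + undercount rate) for two groups.
counts = {"A": {"g1": 900.0, "g2": 100.0}, "B": {"g1": 500.0, "g2": 500.0}}
factor = {"g1": 1.02, "g2": 1.08}

before = shares({a: sum(g.values()) for a, g in counts.items()})
after = shares(synthetic_adjust(counts, factor))
for area in counts:
    print(area, f"{100 * (after[area] / before[area] - 1):+.2f} percent")
```

Because the sum is fixed, the area with more of the heavily undercounted group gains exactly what the other loses.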

Only  five  States  (including  the  District  of  Columbia)  had 
an  adjustment  of  more  than  1  percent,  but  four  of  these  (all 
but  Hawaii)  would  have  gained  by  an  adjustment.  My 
hypothesis  is  that,  in  general,  the  gains  through  adjustment 
are  more  concentrated  (by  area,  race,  and  in  other  respects) 
than  the  losses.  The  concentration  of  positive  effects  of 
adjustment  (i.e.,  of  relative  undercount)  would  cause  the 
States  that  expect  to  gain  by  adjustment  to  complain  strongly 
and  leave  the  ones  that  lose  indifferent.  The  result  of  this 
could  well  be  a  strong  push  for  adjustment  and  only  weak 
resistance  to  it.  When  a  fixed  sum  is  to  be  divided  there  is 
no  net  gain  or  loss  to  the  whole  country,  but  nonetheless 
the  net  drive  to  adjust  can  be  strong. 

One  might  develop  a  statistical-political  model  of  the 
situation  where  there  are  many  small  losers  by  an  adjustment 
and  a  few  large  gainers.  If  the  drive  to  adjust  is  convex  below, 
so  that  the  State  (or  other  jurisdiction)  with  twice  as  much 
to  gain  exercises  more  than  twice  the  pressure  for  adjust- 
ment, then  the  net  pressure  will  be  in  favor.  Such  a  model 
needs  study  by  anyone  trying  to  forecast  the  balance  of 
pressures  on  the  census. 
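
A toy version of such a model, in Python, is given below. The power-law form of the pressure function and the figures for gainers and losers are illustrative assumptions only; the point is that a convex response turns a zero-sum set of gains into a net push for adjustment.

```python
def net_pressure(gains: list[float], exponent: float = 2.0) -> float:
    """Sum of signed pressures, where pressure rises as |gain| ** exponent.

    An exponent above 1 makes the response convex: twice the stake produces
    more than twice the pressure. The power form is an illustrative assumption.
    """
    total = 0.0
    for g in gains:
        magnitude = abs(g) ** exponent
        total += magnitude if g > 0 else -magnitude
    return total


# Hypothetical changes in allocation: 4 large gainers, 16 small losers.
gains = [4.0] * 4 + [-1.0] * 16
print(sum(gains))              # the fixed sum nets to zero
print(net_pressure(gains))     # convex response: net pressure favors adjustment
print(net_pressure(gains, 1))  # linear response: pressures cancel
```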

EQUITY  AND  EXPECTATIONS 

The  concentration  of  the  undercount  in  certain  age 
categories  is  in  one  sense  an  advantage  in  that  it  allows  a 
series  of  censuses  to  provide  a  more  exact  total  than  any  one; 
so,  it  can  be  used  to  adjust  the  current  census.  The 
concentration  in  region  and  race  groups  is  wholly  a  disadvan- 
tage in  that  it  produced  a  distribution  of  Federal  funds 
different  from  what  Congress  intended.  Often  it  is  the  very 
groups  Congress  aimed  most  clearly  at  helping  that  are 
understated.  The  problem  is  to  rectify  the  apportionment 
but  not  to  overturn  the  census  in  so  doing. 

The  argument  of  this  paper  is  that  unanticipated  random 
error  has  to  be  accepted.  If  to  the  true  apportionment  there 
were  added  a  random  component  of  expected  value  zero,  this 
would  be  no  worse  for  States  than  the  sort  of  random 
variation  that  any  jurisdiction  is  subject  to  from  many 
causes.  It  would  be  comparable  with  bad  growing  weather  in 
Kansas,  or  having  one  of  its  sons  become  President  and  thus 
producing  a  tourist  boom  for  Plains,  Ga.  Random  events  that 
bring  income  or  force  expenditure  are  not,  in  general,  thought 
to  require  Federal  compensation. 

It  is  nonrandom  variation  known  in  advance  that  arouses 
just  resentment.  It  seems  only  fair  that  the  census  results 
should  be  adjusted  in  the  simplest  possible  way  that  will 
achieve  the  intentions  of  Congress.  One  such  way  is  to
increase  not  the  census  figure  but  the  payment  in  accord
with  the  relative  understatement.  If  the  net  1980  undercount 
is  estimated  at  1.9  percent  of  the  enumerated  for  whites  and 
7.7  percent  for  blacks,  the  payment  for  blacks  would  be 
raised  by  7.7  -  1.9  =  5.8  percent.  A  municipality  would  be
given  a  bonus  of  5.8  percent  for  its  blacks  to  offset  relative 
census  incompleteness  or  whatever  corresponding  figure  was 
appropriate  for  1980.  We  know  less  about  the  understate- 
ment of  the  Spanish  population,  and  the  same  bonus  could 
be  given  to  it. 
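
The bonus calculation can be sketched in Python as follows. The undercount rates are the illustrative ones given in the text; the municipality's counts, the $10 per capita rate, and the function names are hypothetical.

```python
def relative_bonus(undercount: dict[str, float], reference: str) -> dict[str, float]:
    """Bonus in percentage points: each group's undercount minus the reference group's."""
    base = undercount[reference]
    return {group: round(rate - base, 1) for group, rate in undercount.items()}


def payment(per_capita: float, counts: dict[str, int], bonus: dict[str, float]) -> float:
    """Per capita payment with each group's count raised by its bonus."""
    return sum(per_capita * n * (1 + bonus[g] / 100) for g, n in counts.items())


# Undercount rates are the illustrative ones in the text; the municipality's
# counts and the $10 per capita rate are hypothetical.
bonus = relative_bonus({"white": 1.9, "black": 7.7}, reference="white")
print(bonus)
print(payment(10.0, {"white": 80_000, "black": 20_000}, bonus))
```

The sketch follows the text in taking the simple difference of the two rates. The exact ratio, 1.077 / 1.019, gives a bonus of about 5.7 percent, a difference too small to matter for a convention of this kind.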

Such  an  adjustment  has  the  advantage  of  unpretentiousness:
no  one  would  make  the  mistake  of  thinking  it
produced  correct  figures  in  each  locality.  The  net  adjustment
for  about  45  of  the  States  would  be  less  than  1  percent.  It 
would  not  reduce  the  incentive  to  complete  enumeration  in 
any  one  jurisdiction.  One  hopes  it  would  be  temporary,  and 
the  Census  Bureau  would  attain  sensibly  equal  completeness 
for  the  races  by  1990  or  2000.  Even  if  it  never  eliminates  the 
undercount  overall,  it  could  eliminate  the  inequity  arising 
from  differential  undercount. 

THE  LOGIC  OF  NEGOTIATION 

A  related  aspect  of  our  theme  is  the  logic  of  negotiation, 
to  which  game  theory  is  applicable.  If  A  is  willing  to  sell  his 
house  for  $80,000  or  more  and  B  is  willing  to  buy  for 
anything  up  to  $100,000,  then  without  further  data,  if  these 
two  are  the  only  participants  in  the  purchase  and  sale,  the 
price  is  indeterminate  in  a  range  of  $20,000.  Negotiation  is 
likely  to  be  casual  to  the  point  where  $5,000  or  $10,000 
alteration  in  the  price  can  be  conceded  on  an  impulse  by 
either  party.  It  is  possible  that  A  proposes  $95,000  and  B 
proposes  $85,000  and  they  genially  split  the  difference  at 
$90,000. 

Once  they  have  signed  up  for  $90,000,  further  alterations 
will  not  be  made  casually.  The  seller  wants  to  take  away  a 
light  fixture  worth  $35  because  it  would  go  well  in  his  new 
home;  the  buyer  sternly  points  out  that  light  fixtures  were 
implicitly  sold  as  part  of  the  house  in  the  deal  that  both 
agreed  to. 

In  application  to  our  census  undercount  problem,  the 
parties  are  far  more  likely  to  come  to  agreement  before  the 
census  is  taken  than  after  the  count  is  known.  To  leave 
conditions  open  in  any  important  respect  is  like  leaving 
unspecified  elements  in  a  contract— it  makes  agreement  very 
difficult  and  leads  to  highly  unprofitable  litigation.  The 
sooner  all  participants  develop  expectations  consistent  with 
what  is  going  to  happen,  the  fewer  the  occasions  of 
disagreement.  The  Census  Bureau  has  been  acting  in  accord 
with  this  principle  in  informing  local  authorities  of  its 
procedures  in  the  utmost  detail;  perhaps  it  could  go  further. 

NEGOTIATION  PRESSES  TO  THE  MARGIN 

Once  the  essential  matter  is  settled  (that  payments  will  be
made  in  proportion  to  population),  negotiation  shifts  to
marginal  variations  in  the  definition  and  count  of  popula- 
tions. An  incentive  is  offered  to  seek  out  any  part  of  the 
census  procedure  that  could  be  made  to  seem  wrong,  an 
incentive  amounting  to  millions  of  dollars  in  the  case  of  a 
medium-sized  city,  to  hundreds  of  millions  in  the  case  of 
New  York.  The  dynamics  here  could  destroy  the  best  census 
in  the  world.  All  possible  census  procedures  contain  arbitrary 
elements,  and  it  is  in  the  nature  of  census  taking  that  no 
census  can  stand  up  to  such  partisan  examination. 

Every  piece  of  the  legislation  we  are  concerned  with  has  a 
target,  and  it  should  have  a  convention  that  will  provide  a 
reasonable  approximation  to  the  target.  That  a  convention 
can  be  accepted  that  everyone  knows  is  only  a  rough 
approximation  to  the  target  is  shown  by  the  200-year  history 
of  apportionment  in  the  House  of  Representatives.  Alabama 
was  actually  given  7  representatives  on  the  basis  of  the  1970 
census,  but  census  figures  corrected  by  methods  beyond 
challenge,  using  the  previous  census  and  other  data,  would 
entitle  it  to  8;  and  California  would  give  up  one  of  its  43 
[6].  After  most  censuses,  at  least  one  State  turns  out  to  be
so  deprived  because  the  convention  (in  this  case  the  census 
count  as  published  by  the  Bureau)  is  not  quite  on  target.  Yet 
no  one  says  that  democracy  is  frustrated  by  California  having 
one  seat  too  many  and  Alabama  one  too  few. 

DETERMINACY  VERSUS  PRECISION  IN 
POSTCENSAL  ESTIMATES 

On  apportionment  of  funds,  the  problem  before  us  is  to 
choose  among  conventions.  The  choice  involves  a  conflict 
between  determinacy  and  precision  that  may  be  illustrated 
by  the  way  postcensal  adjustments  can  be  made. 

One  convention  is  to  take  the  published  count  at  the  last 
preceding  census  and  stay  with  it  for  10  years.  This  has  the 
advantage  that  each  jurisdiction  would  know  at  the  start 
what  it  was  to  get  and  could  make  plans  for  disbursement. 
But  some  jurisdictions  might  protest  that  they  had  been 
growing  and  were  likely  to  grow  in  the  future,  so  the  "last 
census"  formula  would  give  them  less  than  Congress  intended 
them  to  have.  The  point  might  be  met  by  saying  that  every 
jurisdiction  would  get  the  straight-line  projection  of  its 
population  from  the  last  two  censuses.  This  would  have  the 
same  advantage  of  being  immediately  known  and  calculable 
and  does  not  discriminate  against  the  growing  parts  of  the 
country.  The  future  is  sufficiently  unknown  that  this  does 
not  discriminate  against  anyone  in  an  obvious  way. 
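
The straight-line convention is easy to state exactly, which is its attraction. A minimal Python sketch, with a hypothetical jurisdiction and a hypothetical function name:

```python
def straight_line(pop_prev: float, pop_last: float, years_after_last: float,
                  interval: float = 10.0) -> float:
    """Extend the line through the last two censuses beyond the last one."""
    slope = (pop_last - pop_prev) / interval
    return pop_last + slope * years_after_last


# Hypothetical jurisdiction: 100,000 in 1960 and 120,000 in 1970.
for year in (1970, 1975, 1980):
    print(year, straight_line(100_000, 120_000, year - 1970))
```

Anyone with the two published counts obtains the same figure for every year of the decade, which is the determinacy the convention is meant to secure.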

Intercensal  estimates  are  not  the  subject  of  this  con- 
ference, and  I  cite  them  only  to  bring  out  some  of  the  issues, 
in  particular  to  help  see  the  choice  to  be  made  between 
accord  with  the  objective  of  the  legislation  and  accuracy.  For 
clearly  anyone  can  do  better  than  the  straight-line  pro- 
jections with  such  supplementary  material  as  building 
permits,  city  directories,  or  local  counts  made  after  the  last 
census.  But  these  would  depend  on  how  the  person  making 
them  chose  among  various  items  of  ancillary  data.  Is  a 
straight-line  extrapolation,  which  gives  exactly  the  same
number  irrespective  of  who  makes  it,  preferable  to  a
professional  estimate  that  depends  on  someone's  decision  to
trust  building  permits  rather  than  a  city  directory?  The 
user  who  wants  determinacy,  simplicity,  and  objectivity 
above  all  will  take  the  first;  the  one  who  wants  precision  and 
fidelity  to  the  object  of  the  legislation  will  take  the  second. 
The  choice  of  intercensal  estimates  is  analogous  to  the  choice 
on  the  number  to  accept  for  the  census  itself. 

THE  DECISION  TREE 

To  adjust  the  census  or  not  to  adjust  it  is  far  from  a 
symmetrical  choice.  The  unadjusted  result  that  comes  out  of 
the  standard  and  traditional  Census  Bureau  procedures  is  a 
single  possibility;  adjustment  is  many  possibilities.  Once 
adjustment  is  decided  on,  it  would  have  to  be  decided  further 
whether  (a)  to  use  an  objective  and  uniform,  though 
admittedly  inaccurate,  synthetic  method,  or  (b)  a  subjective 
method  that  would  give  greater  precision.  If  the  former, 
whether  the  base  figure  would  be  the  whole  of  the  United 
States  or  individual  regions.  Figure  1  shows  part  of  the 
decision  tree  that  would  be  required  if  there  is  to  be 
adjustment. 

It  might  be  decided  to  match  with  the  postenumeration 
survey  (PES),  find  the  people  included  in  the  PES  that  were 
missed  in  the  census,  identify  their  characteristics,  and  cor- 
rect the  count  for  each  jurisdiction  by  regression.  But  the 
size  of  the  PES  sample  would  not  suffice  to  provide  data  for 
individual  jurisdictions.  Hence,  the  ratios  or  regressions 
would  have  to  be  obtained  for  some  larger  area,  perhaps  a 
State  or  region,  and  applied  to  all  individual  jurisdictions 
within  the  larger  area;  the  procedure  has  come  to  be  called 
synthetic.  To  decide  how  large  the  area  should  be  from 
which  the  coefficients  would  be  obtained  necessitates  a 
tradeoff  between  sampling  error  (which  grows  greater  as  one 
attempts  to  derive  coefficients  from  smaller  areas)  and
appropriateness  of  the  coefficients  (which  grows  greater  as  the
area  from  which  they  are  derived  becomes  smaller).  Beyond
this  is  the  question  of  what  characteristics  shall  be  recognized 
in  the  regressions:  Would  one  go  beyond  race  to  age  and  sex, 
and  beyond  that  to  income,  size  of  dwelling,  etc.?  The  de- 
cision diagram,  as  drawn  up  here,  contains  only  a  small  part 
of  the  choices  that  would  have  to  be  made  once  we  start 
down  the  road  of  adjustment. 

Beyond  the  decisions  indicated  in  the  tree  are  other 
choices.  Would  the  census  be  adjusted  for  all  purposes  down 
to  the  finest  tabulations?  This  could  be  done  consistently 
very  easily  by  computer.  For  example,  if  it  were  decided  that 
each  black  enumerated  was  to  count  for  1.06,  then  that 
could  be  incorporated  individual  by  individual  in  program- 
ming the  tabulations,  which  would  then  come  out  consistent 
with  one  another.  (The  alternative  of  multiplying  up  the 
finished  tables  to  adjust  to  the  new  total  has  the  disadvantage 
that  different  cross-tabulations  would  not  be  consistent  with 
one  another.) 
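
The difference between the two approaches can be seen in a small Python sketch. The weight of 1.06 is the figure used in the text; the microdata, the weight of 1.00 for whites, and the function name `tabulate` are hypothetical.

```python
from collections import Counter

WEIGHT = {"black": 1.06, "white": 1.00}  # 1.06 is the figure used in the text

# Hypothetical microdata: (race, age group) for each enumerated person.
records = (
    [("black", "young")] * 30 + [("black", "old")] * 10
    + [("white", "young")] * 20 + [("white", "old")] * 40
)


def tabulate(key_index: int) -> dict[str, float]:
    """Weighted tabulation by one field, applying the weight record by record."""
    table: Counter = Counter()
    for rec in records:
        table[rec[key_index]] += WEIGHT[rec[0]]
    return dict(table)


by_race = tabulate(0)
by_age = tabulate(1)
new_total = sum(by_race.values())

# Alternative: scale the finished, unweighted age table up to the new total.
raw_age = Counter(age for _, age in records)
scaled_age = {a: n * new_total / len(records) for a, n in raw_age.items()}

print(by_age)      # consistent with the race table
print(scaled_age)  # same total, but a different age distribution
```

Both age tables sum to the adjusted total, but only the record-level version reflects the fact that the adjusted group is younger than the rest in this hypothetical population.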


Such  a  diagram  provides  a  framework  within  which  each 
person  can  find  his  or  her  preference;  that  is,  both  the  good 
and  the  bad  of  it.  In  a  sense,  it  is  guilty  of  complicating  the 
task  before  this  conference  by  suggesting  different  numbers 
for  different  users.  Some  will  want  a  straightforward  method 
that  anyone  can  verify  from  published  census  figures  on  his 
own  scratch  pad;  others  will  seek  every  last  gain  in  accuracy, 
even  at  the  cost  (however  great)  of  complexity. 

RESEARCH 

What  is  the  role  of  knowledge  in  all  this?  Any  trained 
demographer  can  use  outside  information  to  ascertain  that 
the  census  is  short  and  can  produce  some  kind  of  correction. 
It  is  little  more  trouble  to  do  this  separately  for  whites  and 
nonwhites.  Such  calculations  that  would  use  births,  social 
security,  Medicare,  and  other  noncensus  data,  according  to 
the  judgment  of  the  particular  demographer,  would  produce 
a  result  better  than  the  census— 1-  or  2-  or  3-percent 
incompleteness  with  the  material  of  1980  might  be  shown. 
The  cost  of  this  would  be  a  few  thousand  dollars  worth  of  a 
demographer's  time. 

But  what  about  the  Hispanics?  Unlike  whites  and 
nonwhites,  Hispanics  cannot  be  identified  in  consistent 
fashion  on  past  birth  and  death  certificates,  and  this  makes 
the  extra  census  calculation  much  more  elusive.  The  Census 
Bureau  [7]  has  tried  to  see  what  it  could  make  of  aggre- 
gate statistics;  its  resulting  bulletin  does  not  lay  claim  to 
any  dazzling  result.  If  the  matter  is  important  enough, 
one  could  resort  to  a  sample  in  the  field,  but  then  the  cost 
rises  from  thousands  of  dollars  to  millions.  One  would  have 
to  draw  a  sample  of  the  population  (presumably  with 
heavier  sampling  ratios  in  areas  where  Hispanics  were  con- 
centrated) and  then  make  a  name-by-name  check  to  the 
census  to  see  what  fraction  was  missed.  When  this  was  done 
with  the  population  as  a  whole  for  1970,  it  came  up  with 
a  fraction  missed  less  than  the  demographic  calculation  using 
outside  sources.  For  the  Hispanics,  there  might  be  greater 
difficulties  than  for  other  elements— for  instance,  native 
white— so  we  can  be  sure  that  the  undercount  as  estimated 
by  an  independent  survey  and  matching  would  be  low.  It  is 
not  inconceivable  that  5  percent  of  Hispanics  would  really 
be  missed  in  the  census,  and  the  fresh  enumeration  plus 
matching  process  would  discover  an  undercount  of  only  3 
percent. 

All  this  is  at  the  national  level.  For  individual  jurisdic- 
tions, some  kind  of  synthetic  method  is  inevitable— using 
either  the  undercount  ratio  for  the  United  States  as  a  whole 
or  that  for  a  region  or  State.  Some  experimenting  has  been 
done  on  the  error  of  a  synthetic  estimate  and  more  would  be 
useful.  Savage  [5]  outlines  an  extensive  research  program, 
and  clearly  we  ought  to  get  all  the  knowledge  we  can  of  the 
effects  of  various  proposed  methods  of  adjustment.  But  work 
on  this  would  share  a  feature  common  to  all  research;  no 
one  can  be  sure  in  advance  that  it  will  succeed  and  produce 
usable  results. 

Figure  1.  A  Partial  Decision  Tree  for  the  1980  Census  Showing  the  Asymmetry
Between  "No  Adjustment"  and  "Adjustment"

CONCLUSIONS:  THE  THREE  OPTIONS

The  problem  of  the  undercount  is  part  of  a  wider 
problem:  At  its  margins  the  1980  census  of  the  United 
States,  like  any  census  anywhere,  is  arbitrary.  This  is  not  in 
the  sense  that  the  census-taking  agency  makes  up  the  figures, 
which  it  does  not;  it  follows  very  strict  and  carefully  worked 
out  procedures.  It  is  arbitrary  in  the  sense  that  other 
procedures  could  be  devised  that  would  produce  slightly 
different  results,  yet  be  technically  defensible.  The  effects  of 
the  decisions  made  at  the  margin  are  numerically  small,  and 
they  would  hardly  affect  most  of  the  uses  of  the  census, 
which  are  concerned  with  providing  information.  There  is 
one  use  that  they  do  affect— the  allocation  of  funds.  For 
information  purposes,  a  1-percent  variation  is  negligible;  for
allocation  of  funds,  it  means  millions  of  dollars.  The  problem 
is  a  new  one  to  statisticians  and  demographers,  and  the  books 
do  not  even  discuss  it,  let  alone  provide  answers. 

This  applies  to  the  decisions  that  determine  the  published 
count;  for  example,  putting  college  students  into  the  area 
where  they  live  while  at  college  rather  than  where  their 
parents  live  and  the  opposite  for  students  in  secondary  resi- 
dential schools.  It  applies  even  more  to  any  allowance  for  the 
undercount  that  would  be  made  on  top  of  the  enumeration. 

Because  of  this  variability,  the  distribution  of  funds  is 
possible  only  on  the  basis  of  an  agreed-on  convention.  One 
way  of  securing  agreement  among  all  parties  is  by  discussion 
in  advance  of  the  census,  before  the  numbers  that  come  out 
of  the  several  ways  of  taking  the  census  are  known.  All  that 
is  needed  for  such  agreement  is  an  unbiased  expected  value. 
After  the  count,  when  it  is  known  that  New  York  will  get
more  on  one  reasonable  adjustment  or  New  Mexico  on 
another,  voluntary  agreement  is  impossible,  and  a  solution 
has  to  be  imposed.  It  could  be  imposed  by  the  legislature  or 
by  some  public  body  of  representative  citizens  approved  by 
the  legislature.  It  could  be  imposed  by  the  Secretary  of 
Commerce  without  such  a  body,  using  powers  delegated  by
Congress.

In  the  establishment  of  a  convention,  both  political  and
technical  considerations  enter.  The  Bureau  of  the  Census  has
pioneered  in  the  improvement  of  its  own  figures,  and  its
statisticians  are  thoroughly  aware  of  the  properties  of  various
proposed  ways  of  adjusting  for  the  undercount.  No  agency  of
government  contains  more  knowledgeable,  skilled,  and
devoted  technicians  than  the  Census  Bureau.  Its  methods  of
survey  and  sampling  have  been  adopted  throughout  the
world.  But  it  would  be  wrong  to  ask  the  Bureau  to  go
beyond  technical  decisions  to  political  ones,  and  we  have
here  a  problem  for  which  technical  considerations  do  not
provide  a  unique  answer.

There  are  several  distinct  ways  to  adjust  the  census,  any
one  of  which  will,  on  the  average,  come  closer  to  the  actual
number  of  people  present,  jurisdiction  by  jurisdiction,  than
the  census  as  enumerated.  Each  could  be  the  basis  of  a
convention.  The  several  possible  conventions  fall  into  three
groups:

1.  Accepting  the  census  as  enumerated.

2.  Adjusting  the  census  count  by  a  simple  objective 
method  that  anyone  can  apply  to  all  jurisdictions  and 
obtain  a  unique  answer. 

3.  Using  all  existing  data  and  doing  the  best  one  can, 
jurisdiction  by  jurisdiction,  and  foregoing  the  unique- 
ness and  objectivity  of  convention  2. 

1.  The  Census  as  Enumerated.  The  first  of  these  has  been 
the  convention  of  the  past.  The  census  number  is  the  count 
of  all  those  persons  of  whose  individual  existence  the  Bureau, 
using  its  standard  procedures,  has  evidence.  Like  the  gold 
standard,  it  is  a  superstition,  perhaps,  but  one  that  avoids 
inflation  and  confusion.  As  used  in  the  past,  it  has  been 
sustained  by  the  courts  and  prevents  any  suspicion  that  the 
Bureau  of  the  Census  is  fudging  its  figures.  The  uncounted 
are  those  who  were  not  found  with  the  effort  mounted  by 
skilled  census  takers  operating  with  a  given  finite  budget.  On 
this  convention,  those  not  found  are  treated  as  not  present; 
the  only  argument  that  could  arise  would  concern  whether 
the  census  effort  is  large  enough  or  whether  its  budget  should 
be  increased. 

2.  A  Simple  Objective  Adjustment.  The  simplest  example 
of  the  second,  a  uniform  method  that  would  take  out  the 
major  differential  of  completeness  for  minorities,  is  to  give 
those  minorities  in  each  jurisdiction  a  bonus  equal  to  their 
relative  undercount  for  the  country  as  a  whole.  If  this  had 
been  applied  in  1970,  each  black  would  have  been  counted  at 
1.06  in  reckoning  the  population  of  jurisdictions  for  revenue 
sharing and other purposes. Failing any way of making a
similar calculation for Hispanics, they too would have been
given  a  bonus  of  the  same  6  percent.  Other  minorities  would 
have  remained  unadjusted.  No  one  has  any  idea  of  what  1980 
number  will  correspond  to  the  6  percent  for  1970,  and  we 
will  have  to  wait  2  or  3  years  to  find  out.  It  may  be  less  than 
6  percent,  because  the  census  effort  is  greater;  it  may  be 
more,  because  new  problems  have  arisen. 
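
The arithmetic of this simplest adjustment can be sketched in a few lines of Python. This is an illustration only: the function name, the group labels, and the jurisdiction counts are hypothetical, and the groups are assumed not to overlap. Only the 1.06 factor for blacks and its borrowed use for Hispanics come from the text.

```python
# National factors from the text: blacks counted at 1.06 (1970 figure),
# the same 6 percent borrowed for Hispanics, other groups unadjusted.
NATIONAL_FACTORS = {"black": 1.06, "hispanic": 1.06}


def synthetic_adjust(counts, factors=NATIONAL_FACTORS):
    """Return the adjusted population total for one jurisdiction.

    counts maps a group label to its enumerated count. Groups without a
    factor are carried over unchanged (factor 1.0). Groups are assumed
    to be mutually exclusive, which is a simplification.
    """
    return sum(n * factors.get(group, 1.0) for group, n in counts.items())


# Hypothetical jurisdiction, invented purely for illustration.
example = {"black": 20_000, "hispanic": 5_000, "other": 75_000}
enumerated = sum(example.values())
adjusted = synthetic_adjust(example)
print(enumerated, round(adjusted))
```

Because the same national factors are applied everywhere, anyone can repeat the calculation for any jurisdiction and obtain the same answer, which is the property that defines this second group of conventions.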

While this adjustment removes the bias for the country
as a whole, it by no means ensures accuracy in individual
jurisdictions [8]. It is a synthetic method, in the
sense that it applies a ratio derived from one area (the United
States as a whole) to another area (the particular
jurisdiction). Possibly it can be improved by taking the ratio not
from  the  United  States  as  a  whole,  but  from  the  particular 
region  or  State.  Thus,  it  would  seem  more  appropriate  to  use 
the  undercount  of  New  York  State  in  adjusting  New  York 
City  than  the  undercount  for  the  country  as  a  whole.  The 
trouble is that the undercount for New York State is not as
accurately  known  as  that  for  the  United  States.  The 
calculation  depends  on  estimates  of  migration,  and  the 
migration  among  States  is  unrecorded,  while  that  for  the 
United  States  as  a  whole  is  at  least  in  large  part  recorded,  so 
the  undercount  ratio  will  always  be  better  known  for  the 
country  than  for  each  State.  The  decision  that  is  preferable 
cannot  be   made  without  extensive  experimenting,  and  the 
Census Bureau is in a better position to do such
experimenting than any other body.

There  are  still  further  possibilities  of  improving  the 
estimates  for  jurisdictions  on  an  objective  basis,  and  with  a 
method  that  would  be  applied  uniformly  to  all  39,000 
jurisdictions. That is, to carry out a postenumeration survey (PES)
of  a  kind  used  in  1970,  but  larger,  in  which  a  sample  of 
areas  containing  some  hundreds  of  thousands  of  people 
would  be  enumerated  a  second  time,  independently  of  the 
census,  but  very  close  in  time  to  April  1,  1980.  The 
enumerators  who  would  do  the  PES  would  be  selected 
among  the  very  best  of  those  hired  for  the  census;  they 
would  be  given  further  training;  they  would  be  better  paid 
and  allowed  more  time  to  search  for  people,  both  in 
apparently  vacant  houses  and  houses  known  to  be  occupied. 
They  would  certainly  do  a  better  job  than  the  original  census 
takers [1].

The  PES  areas  would  then  be  compared  name  by  name 
with  the  census.  If  the  PES  were  perfectly  complete,  and  if 
the  matching  of  names  could  be  carried  out  exactly,  then  one 
would  know  not  only  what  fraction  of  blacks  and  Hispanics 
were  omitted,  but  also  omissions  of  people  of  the  several 
ages,  occupations,  income  groups,  etc.  These  omission  rates 
could  then  be  applied  to  jurisdictions.  The  method  would 
still  be  synthetic,  in  that  the  omission  rates  applied  to  any 
given jurisdiction would not be obtained from that
jurisdiction, but would have to be derived from some larger
area like an entire State or region. With the assumptions
above, there would be a clear gain in accuracy over using the
undercount as calculated without the PES.

Unfortunately,  the  assumptions  are  very  strong.  The  PES 
enumerators  do  better,  but  their  work  is  not  perfect.  And  the 
matching  of  people  they  turned  up  with  those  appearing  in 
the  census  is  subject  to  considerable  error,  whether  done  by 
hand  or  by  machine.  Names  are  misspelled  and  people  move 
even  in  the  few  days  that  elapse  between  the  census  and  the 
PES.  Most  persons  are  properly  enumerated,  and  the  search  is 
for  the  2  or  3  percent  missed.  If  there  is  doubt  on  the 
matching  of  5  or  10  percent  of  the  names,  the  undercount  is 
lost  amid  the  matching  errors.  Once  again  experimenting  will 
be  needed  to  see  if  the  PES  can  provide  better  ratios  than  the 
calculation using births, Medicare registrants, and previous
censuses. 
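
A rough calculation shows why matching error is so damaging. The sketch below assumes, purely for illustration, that each PES person who is truly in the census fails to match with some fixed probability and that no false matches occur. The function name and the specific rates are hypothetical, chosen within the ranges the text mentions.

```python
def apparent_omission_rate(true_omission, false_nonmatch):
    """Share of PES persons who appear to be missing from the census.

    true_omission  -- fraction of PES persons truly missed by the census
    false_nonmatch -- fraction of truly enumerated persons whose names
                      fail to match (misspellings, moves, clerical error)
    Assumes no false matches, which is a simplification.
    """
    return true_omission + (1.0 - true_omission) * false_nonmatch


true_rate = 0.025  # midpoint of the "2 or 3 percent missed"
for nonmatch in (0.0, 0.05, 0.10):  # doubt on 5 or 10 percent of names
    rate = apparent_omission_rate(true_rate, nonmatch)
    print(f"false nonmatch {nonmatch:.0%}: apparent omission {rate:.1%}")
```

Under these assumptions, doubt about 5 percent of the names would by itself make the apparent omission rate roughly three times the true 2.5 percent, so the quantity being sought is smaller than the error in measuring it.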

With  the  knowledge  now  in  existence,  one  cannot  decide 
which  of  the  above  methods  will  give  the  best  results  for 
individual  jurisdictions,  but  we  can  say  where  the  burden  of 
proof  lies.  We  know  that  the  simplest  method  of  taking  out 
the  bias  (say,  multiplying  blacks  and  Hispanics  by  1.06)  will 
improve the census to some degree. Only such more complex
methods as will demonstrably improve on that improvement
should be considered.

3. Using All Existing Data and Trusting to Judgment. All
of the above concern the second approach, in which an
objective method is agreed on or legislated and applied in a
perfectly mechanical fashion to all jurisdictions. The third
kind of convention would put its trust in one agency, either
the  Bureau  of  the  Census  itself  or  some  other  prestigious 
technical  group.  That  body  would  do  the  best  it  could  for 
each  of  the  39,000  jurisdictions.  In  some  areas,  it  might  have 
good  building  permits  that  would  enable  it  to  detect  omitted 
new  dwellings,  and  it  would  adjust  for  these.  In  other  areas, 
there  might  be  a  local  census  that  the  technical  group  judged 
reliable;  it  would  turn  to  that.  There  are  many  other  ways  in 
which  it  could  use  judgment  on  what  ancillary  data  to  bring 
in,  and  the  knowledge  and  experience  of  its  personnel  would 
be  such  that  it  would  surely  obtain  a  better  result  than 
applying  a  uniform  method  throughout. 

But  judgment  is  the  key  word  here.  Others  would  judge 
differently,  and  given  the  partisanship  that  the  distribution 
of  funds  creates,  the  only  way  of  getting  a  unique  and 
workable  result  out  of  this  third  convention  is  to  agree  that 
the  population  of  each  of  the  jurisdictions  is  what  the 
appointed  technical  group  says  it  is.  Its  choice  of  ancillary 
data  could  not  be  opened  to  criticism  or  review  by  the 
jurisdictions  concerned,  because  they  could  easily  come  up 
with  a  result  more  favorable  to  themselves.  If  one  wants  the 
kind  of  improvement  in  accuracy  that  this  way  of  doing 
things  can  provide,  one  has  to  forego  the  right  to  check  the 
technical  group.  Those  who  insist  on  checking  the  judgment 
of  the  technical  group  against  their  own  judgment  are  giving 
up  the  possibility  of  a  unique  solution.  They  should 
properly  revert  to  the  second  type  of  convention,  where  the 
calculation  is  simple  and  perfectly  objective. 

HOW  TO  CHOOSE  THE  CONVENTION 

The  main  concern  is  the  integrity  of  the  census.  As  long  as 
that  is  preserved,  we  can  live  with  any  of  the  methods.  Even 
if  it  were  decided  that  convention  1  was  to  be  followed,  the 
minority  groups  need  not  suffer.  Congress,  noting  that  the 
undercount handicaps them, could merely change the
formulas used on the several grants. It could, for example, add
to  what  each  jurisdiction  receives  an  amount  equal  to  6 
percent  times  its  fraction  of  blacks.  Convention  2  is  objective 
and  open  to  verification  by  those  concerned.  Convention  3 
would  give  a  result  closer  to  the  actual  population,  but  would 
require  that  everyone  involved  trust  the  technical  group 
(Census  Bureau  or  other)  that  did  the  work  and  forego  the 
right  of  review.  Each  has  its  advantage. 
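
The formula change suggested for convention 1 can be illustrated with a short sketch. Reading "6 percent times its fraction of blacks" as a proportional increase in a jurisdiction's grant is an assumption, and the function name and dollar figures are hypothetical.

```python
def grant_with_bonus(base_grant, black_fraction, bonus_rate=0.06):
    """Base grant plus a bonus of bonus_rate times the black fraction.

    Treating the bonus as a proportional increase in the grant is an
    assumption about how the text's suggestion would be implemented.
    """
    return base_grant * (1.0 + bonus_rate * black_fraction)


# Hypothetical jurisdiction: $1,000,000 base grant, 25 percent black.
print(round(grant_with_bonus(1_000_000, 0.25)))
```

The census figures themselves stay untouched in this approach; the correction lives entirely in the grant formula that Congress writes.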

What  will  be  fatal  is  indecision  as  among  the  methods. 
Since  for  any  given  jurisdiction  they  give  different  numerical 
results,  and  none  is  in  any  sense  "right,"  indecision  will  be  an 
invitation  to  each  jurisdiction  to  press  its  own  way  of 
calculating  the  undercount.  It  will  have  recourse  to  the 
courts,  which  do  not  know  the  true  population  any  more 
than  the  rest  of  us  do,  and  which  could  well  come  up  with 
different solutions in different cases and so establish
mutually contradictory precedents. The reputation of the Bureau
of  the  Census  could  hardly  stand  up  against  the  host  of 
pressures.  In  effect,  millions  of  dollars  in  reward  would  be 
offered to a jurisdiction that could prove the census wrong.
No  institution  has  a  reputation  solid  enough  to  stand  against 
that  kind  of  incentive  system. 

How  to  avoid  the  threatened  confusion  and  danger  to  the 
Census  Bureau's  integrity  ought  to  be  a  preoccupation  of 
citizens and of Congress. That integrity can be protected by dividing the
task  between  the  part  that  is  technical  and  the  part  that  is 
political  and  assigning  each  to  a  competent  body.  An  answer 
that  has  been  suggested,  and  with  which  I  am  thoroughly  in 
agreement,  would  involve  just  two  steps: 

1. Ask the Census Bureau to set down the options for
adjustment.  It  would  be  able  to  describe  technically 
respectable  ways,  falling  under  the  three  heads  above, 
in  much  more  detail  and  with  much  more  knowledge 
than  this  writer  can  furnish.  It  would  assure  the 
authorities  that  any  of  the  methods  of  adjustment  it 
set  down  would  provide  estimates  for  jurisdictions  that 
on  the  average  (not  necessarily  in  each  separate 
jurisdiction)  would  come  closer  to  the  true  population 
than  the  unadjusted  1980  figure.  It  could  easily  offer 
more than one method falling under convention 2
above.

2. Either the legislature or some other representative (i.e.,
political) body, not primarily consisting of technicians,
but,  of  course,  able  to  consult  them,  would  determine 
which  of  the  methods  listed  under  (1)  would  be  used. 
Alternatively,  it  might  decide  on  conventions  1  (no 
adjustment)  or  3  (entrusting  the  whole  matter  to  the 
Census  Bureau  or  some  other  technical  body,  without 
the  privilege  of  criticizing  its  resulting  numbers). 

This  seems  the  most  orderly  way  to  take  advantage  of  the 
technical  skill  of  the  Bureau  of  the  Census,  and  to  avoid 
burdening  it  with  political  responsibility. 
BACKGROUND


For  more  than  half  a  century,  the  Bureau  of  the  Census 
has  created  and  shaped  a  statistical  environment.  The 
decennial  census  is  the  largest,  most  expensive,  and  from  the 
public's  point  of  view,  the  most  visible  part  of  that 
environment.  Like  the  physical  environment,  the  census 
touches  the  life  of  each  American  and  more  than  just  once 
every  10  years.  Federal  and  State  fund  allocations  and 
corporate  expenditures  are  influenced  daily  by  census  results. 
Any  significant  change  in  the  way  the  census  is  taken  or 
tabulated  will  affect  the  political,  legal,  institutional,  and 
financial  environment  of  private  and  public  activities.  There 
can  be  no  doubt  that  the  planned  adjustment  of  decennial 
census  figures  for  the  lack  of  a  100-percent  count  is  a 
significant  change  from  present  policy  that  will  have 
far-reaching  and  long-lasting  consequences. 

Since  the  passage  of  the  National  Environmental  Policy 
Act  of  1969,  any  project  or  action  that  may  affect  the 
environment  requires  an  environmental  impact  statement. 
The  drafters  of  that  statute  probably  were  thinking  more 
about  river-course  adjustment  than  statistical  adjustment,  but 
the  manner  in  which  vast  sums  of  money  are  spent  through 
the  use  of  census  statistics  affects  us  just  as  directly  as  a 
diverted  river.  The  1969  act  recognizes  this  when  it  states 
that  the  Federal  Government  shall  improve  and  coordinate 
its  plans  and  programs  so  as  to  "achieve  a  balance  between 
population  and  resource  use  which  will  permit  high  standards 
of  living  and  a  wide  sharing  of  life's  amenities." 

An  environmental  impact  statement  requires  a  detailed 
description  of  the  following: 

1.  The  proposed  action 

2.  The  present  environment 

3.  The  expected   impact  of  the  proposed  action  on  the 
present  environment 

The  description  of  the  proposed  action  details  its  purpose, 
nature,  and  extent,  as  well  as  its  timing  and  methods  of 
execution.  The  description  of  the  present  condition  includes 
sufficient detail to discover, insofar as possible before any
action,  all  life  forms  or  other  things  of  a  unique  and 
irreplaceable  nature  that  might  be  affected  by  the  proposed 
action.  In  addition  to  a  general  discussion  of  the  impact  of 
the  proposed  action  on  the  environment,  the  third  part  of 
the  statement  includes  details  of: 

a. Any unavoidable adverse effects

b. Any irreversible commitment of resources

c.  The  possible  impact  on  long-term  use  or  productivity 

d.  Any  mitigating  measures  that  might  be  taken 

e.  Any  alternatives  to  the  proposed  action 

The  main  part  of  this  paper  will  follow  the  format  outlined 
above  in  an  effort  to  determine  the  impact  of  adjusting  for 
census  undercount. 

THE  PROPOSED  ACTION 

In  June  of  1979,  Senator  Daniel  P.  Moynihan  introduced  a 
bill  (S.B.  1606)  that  would  have  amended  title  13  of  the  U.S. 
Code  to  say  "In  conducting  that  census  in  1980  the 
Secretary  shall  adjust  the  population  figures,  employing  the 
best  available  methodology  to  correct  for  undercounting. 
These  adjusted  population  figures  shall  be  used  by  every 
Federal  officer  or  employee  administering  a  program  under 
which  funds  or  benefits  are  allocated  or  distributed  among 
States  or  other  units  of  government  on  the  basis  of 
population." 

The  Panel  on  Decennial  Census  Plans,  convened  by  the 
National  Academy  of  Sciences  in  1977,  concluded  in 
recommendation  23  that  "inequities  resulting  from  the 
geographic  differentials  in  the  decennial  census  undercount 
could  be  reduced  by  adjustment  of  the  data  for 
underenumeration.  Methods  of  adjustment  with  tolerable 
accuracy  are  feasible.  While  the  application  of  these  methods 
has  some  arbitrary  features  and  while  the  figures  for  some 
areas  would  not  be  made  closer  to  the  correct  distribution  of 
population,  the  panel  believes  that  on  balance  an 
improvement  in  equity  would  be  achieved."  The  panel  goes 
on  to  say  that  "If  the  Secretary  of  Commerce  agrees  with  the 
panel's  conclusion"  the  Census  Bureau  should  be  directed  by 
the  Secretary  of  Commerce  to  "adjust  for  underenumeration 
the  counts  for  total  population  of  the  United  States,  the 
States,  and  local  areas,  for  use  in  distributing  funds.  The 
adjustments  would  not  be  applied  to  the  counts  used  for 
legislative  apportionment  nor  to  the  body  of  census  data  on 
the  characteristics  of  the  population." 

These  are  only  two  of  a  number  of  proposals  for 
undercount  adjustments,  but  they  reflect  the  general 
consensus  on  the  purpose  and  limitations  of  such  adjustment. 
The  purpose  is  to  foster  greater  equity  in  the  distribution  of 
Federal  funds.  Prior  to  the  widespread  use  of  census  statistics 
for  allocating  funds,  undercount  adjustment  was  hardly  ever 
an issue. But, during the next decade, the amount of
money— billions  of  dollars— to  be  allocated  to  State  and  local 
governments  will  be  large  enough  for  them  to  be  asking,  "Are 
we  getting  our  fair  share?" 

The  two  limitations  most  generally  agreed  upon  are  that 
the  adjusted  figures  would  not  be  used  for  reapportionment 
and  that  adjustments  would  be  made  only  to  the  total 
population  for  State  and  local  governmental  entities.  The 
first  limitation  is  imposed  primarily  because  of  time 
constraints.  It  would  be  technically  impossible  to  produce 
estimates  of  the  undercount  in  time  to  give  the  President 
reapportionment  figures  by  January  1,  1981.  Besides, 
population  data  for  city  blocks  (an  area  too  small  for  any 
adjustment  procedure  now  considered)  are  used  to  draw 
congressional  district  boundaries.  The  second  limitation 
comes  about  because  many  of  the  other  items  of  data  from 
the  census  (such  as  income)  suffer  more  from  underreporting 
than  undercount. 

It  will  be  assumed  here  that  adjustment  would  be  limited 
to  the  total  population,  but  that  characteristics  such  as  age, 
race,  sex,  and  location  (i.e.,  urban  versus  rural)  would  be 
taken  into  consideration  in  the  adjustment  procedure.  The 
geographic  areas  to  be  considered  for  adjustment  are  States, 
SMSA's, and revenue-sharing jurisdictions. The Bureau of the
Census's current plans for 1980 census coverage evaluation
will  produce  estimates  of  undercount  for  the  Nation,  regions, 
divisions,  States  (including  the  District  of  Columbia),  and 
probably  the  20  largest  SMSA's.  Estimates  of  undercount  for 
the  smaller  revenue-sharing  areas  would  require  the  use  of  a 
different  method  than  the  one  used  for  large  areas.  Most 
likely  it  would  be  a  synthetic-  or  regression-estimation 
technique. 

The  methods  currently  being  contemplated  for  estimating 
census  undercount  for  States  and  other  large  areas  are 
demographic  analysis  and  a  postcensus  sample  survey  known 
as  the  postenumeration  survey  (PES).  Demographic  analysis 
has already begun. It involves analyzing past census data and
administrative  records  to  develop  an  expected  population. 
The  PES  will  be  an  independent  sample  survey  of  some 
250,000  households  to  begin  shortly  after  the  census.  The 
fieldwork  for  the  survey  is  expected  to  take  about  6  months. 
Processing  the  survey  results  and  matching  the  results  with 
administrative  records  to  produce  accurate  State-level 
undercount  estimates  is  expected  to  take  until  the  fall  of 
1983  or  the  beginning  of  1984.  Since  additional  work  would 
have  to  be  done  to  actually  calculate  the  undercount 
adjustments  and  apply  them  to  each  geographic  unit,  it  seems 
likely  that  the  adjusted  population  figures  would  be  available 
around  April  1984.  It  is  assumed  here  that  all  of  the 
statistical  work  required  for  undercount  adjustment  would 
be  done  by  the  Bureau  of  the  Census  according  to  the 
methods described in the paper by Bateman and Cowan [1].
Minor  changes  in  the  method  by  which  the  undercount  is 
estimated  are  not  expected  to  significantly  alter  the 
consequences  of  the  undercount  adjustment. 

In  summary,  the  key  points  of  the  proposed  actions  are: 


1.  Population  totals  from  the  1980  census  would  be 
adjusted  for  an  estimated  undercount  for  the  entire 
United  States,  the  States,  the  SMSA's,  and 
approximately 39,000 local governments that receive
revenue-sharing funds.

2.  Adjusted  population  figures  would  be  used  for 
allocations  of  Federal  funds  but  not  for 
reapportionment. 

3.  The  estimation  of  the  undercount  and  associated 
statistical  work  would  be  done  by  the  Bureau  of  the 
Census,  using  a  large  postcensus  survey  and 
demographic  analysis;  this  work  is  expected  to  take 
about  3  years  from  the  date  of  the  census. 

4.  Only  population  totals  would  be  adjusted  for  the 
undercount,  but  variables  such  as  age,  race,  and 
location  would  be  considered  in  the  adjustment 
procedure  to  account  for  differentials  in  the 
undercount. 

THE  PRESENT  SITUATION 

Attitudes  Toward  the  Census 

Aside  from  undercount  adjustment  procedures,  the  1980 
census will be conducted and the figures tabulated as in 1970,
with a few significant differences. Substantially more
money  ($200  million)  has  been  spent  prior  to  this  census 
than  before  any  previous  one  on  a  massive  effort  to  improve 
the  rate  of  response.  The  Census  Bureau  has  obtained  the 
cooperation of groups such as the National Association of
Broadcasters,  the  Boy  Scouts  of  America,  the  National 
Football  League,  and  others  to  deliver  the  message  on  the 
importance  of  responding  to  the  census.  In  addition,  the 
Bureau's  Community  Services  Program  will  encourage 
grass-roots  support  for  the  census,  and  the  Local  Review 
Program  will  give  municipalities  a  chance  to  review 
preliminary  census  figures.  This  direct  involvement  in  the 
census  by  local  organizations  is  expected  to  encourage 
cooperation  as  well  as  increase  confidence  in  the  results. 

However,  there  have  also  been  some  changes  since  1970  in 
the  American  public's  perception  of  its  Government.  It  has 
become  a  cliche  to  refer  to  the  late  1970's  as  the 
"post-Watergate"  period,  but  there  is  no  question  that  the 
credibility  of  Government  officials  has  been  damaged.  This, 
in  combination  with  the  Government's  inability  to  deal  with 
the  problems  of  inflation  and  energy,  has  created  a  mood  of 
alienation  and  hostility.  So  many  public  statements  by 
Federal  officials  have  turned  out  to  be  "inoperative"  at  some 
later  time  that  any  statement,  no  matter  how  factual,  is 
viewed  with  suspicion.  Thus,  if  the  Director  of  the  Census 
Bureau  says,  quite  accurately,  "Your  census  questionnaire  is 
confidential,"  a  not  inconsequential  fraction  of  the  public 
says,  "Who  are  you  trying  to  fool?"  Among  professional 
people  involved  with  statistics,  the  Bureau  of  the  Census  is 
viewed  as  an  agency  of  unquestionable  integrity,  but  much  of 
the public makes no distinction between one agency of the
Federal  Government  and  another. 

In  Congress,  the  Census  Bureau  has  been  fair  game  for 
some time. Prior to the 1970 census it was Congressman
Betts; in 1977 it was Congressman Lehman. The objective of
the  original  Lehman  bill  was  quite  plain:  to  wrest  control  of 
the  population  figures  away  from  the  Census  Bureau.  A  Wall 
Street  Journal  editorial  on  the  subject  was  appropriately 
headlined,  "Pork  Barrel  Census." 

There  is  no  doubt  that  other  attempts  will  be  made  to 
control  the  information  that  the  Census  Bureau  produces. 
The  stakes  are  high,  and  the  Census  Bureau  is  hardly  ready 
for  political  combat.  Except  during  census  time,  it  has  only  a 
small  number  of  employees,  and  most  of  them  are  in 
Washington.  Its  annual  budget  is  an  insignificant  fraction  of 
total  Federal  outlays,  and  it  lacks  a  powerful  constituency 
such  as,  for  example,  the  Department  of  Defense  enjoys.  Its 
vulnerability,  in  combination  with  its  great  value  as  the  only 
objective  "factfinder  for  the  Nation,"  gives  the  Census 
Bureau  the  characteristics  of  an  endangered  species. 

Census  Tabulations 

The  census  will,  of  course,  take  place  around  the  first  of 
April.  By  January  1,  1981,  official  population  totals  will  be 
given  to  the  President  for  determining  the  number  of 
representatives  from  each  State. 

Some  time  early  in  1981  the  first  set  of  computer  tapes 
will  be  available.  These  tapes  will  contain,  in  addition  to 
population  totals,  several  hundred  tabulations  of  population 
and housing data from the short-form questionnaire. If the
1970 census is any indication, by the end of 1981 thousands
of  reels  of  data  will  have  been  sold  by  the  Census  Bureau, 
summary  tape  processing  centers,  State  Data  Centers,  and 
others.  In  addition  to  being  copied  many  times,  these  tapes 
will  be  manipulated  in  every  conceivable  way  to  produce 
statistics  in  forms  to  match  users'  needs.  By  the  end  of  1982, 
all  summary  tapes  and  most  printed  reports  will  have  been 
published  and  distributed. 

Federal  and  State  Use  of  the  Census 

Census  statistics  have  been  used  for  governmental 
purposes  since  the  first  census  in  1790,  but  the  widespread 
use  of  census  data  for  funds  allocation  and  business  research 
is  a  recent  phenomenon.  At  the  Federal  level,  many  of  the 
very  powerful  political  coalitions  that  operated  in  the  past 
have virtually disappeared. With the lack of any strong and
lasting  political  basis  for  the  distribution  of  money,  the 
Federal  Government  has  turned  to  using  an  unbiased 
measuring  stick— census  statistics.  Money  is  still  handed  out 
on  a  "discretionary"  basis,  but  more  and  more  transfers  of 
funds  from  the  Federal  to  the  local  level  are  based  on  census 
figures. 


Over  a  hundred  Federal  programs  use  census  statistics  in 
their  allocation  formulas,  but  one  of  the  largest  and  most 
visible  programs  is  revenue  sharing.  Begun  in  1972,  it  sends 
money  directly  from  the  Treasury  Department  to  State  and 
local  governments  based  on  a  formula  that  uses,  among  other 
factors,  population  and  per  capita  income.  It  is  in  this 
program  that  equity  was  thought  to  be  most  affected  by  an 
undercount.  About  39,000  local  governments  have  been 
receiving  about  $7  billion  a  year  under  this  program.  The 
following  table  shows  the  number  of  revenue-sharing 
recipients  in  1976  by  population  size: 


Population              Number of areas    Percent of total

Total                        39,279             100.0
Over 1,000,000                   68               0.2
500,000 to 999,999               83               0.2
100,000 to 499,999              469               1.2
50,000 to 99,999                659               1.7
10,000 to 49,999              4,185              10.7
2,500 to 9,999                6,685              17.0
1,000 to 2,499                7,473              19.0
Under 1,000                  19,657              50.0

We can see that more than two-thirds of all
revenue-sharing recipients are smaller than an average-size census
tract,  and  half  of  them  are  about  the  size  of  an  enumeration 
district.  However,  there  are  still  over  a  thousand  communities 
of  50,000  or  more  people  receiving  these  funds,  and  there 
can  be  no  doubt  that  many  of  them  have  come  to  depend  on 
this  money  to  keep  down  property  taxes.  Unless  replaced 
with  a  similar  program,  revenue  sharing  is  likely  to  be  with  us 
for  a  very  long  time. 
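
The tallies behind these statements can be checked directly from the table. The sketch below is only a convenience for the reader: the variable and function names are hypothetical, and treating 2,500 as the cutoff for "smaller than an average-size census tract" is an assumption suggested by the table's size classes rather than a figure given in the text.

```python
# Revenue-sharing recipients in 1976 by population size (from the table).
# Each entry: (lower bound of the size class, number of areas).
RECIPIENTS = [
    (1_000_000, 68),
    (500_000, 83),
    (100_000, 469),
    (50_000, 659),
    (10_000, 4_185),
    (2_500, 6_685),
    (1_000, 7_473),
    (0, 19_657),
]

total = sum(n for _, n in RECIPIENTS)


def areas_at_least(threshold):
    """Number of recipient areas in size classes starting at or above threshold."""
    return sum(n for lower, n in RECIPIENTS if lower >= threshold)


def share_below(threshold):
    """Fraction of recipient areas in size classes below threshold."""
    return (total - areas_at_least(threshold)) / total


print(total)                         # should equal the table's printed total
print(areas_at_least(50_000))        # communities of 50,000 or more
print(f"{share_below(2_500):.1%}")   # share under the assumed tract cutoff
```

The class counts sum to the printed total of 39,279, and the size classes of 50,000 or more together contain 1,279 areas, consistent with "over a thousand communities."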

Many  Federal  programs  use  census  statistics  to  distribute 
funds  for  nonmunicipal  areas  such  as  neighborhoods  or 
special-benefit  districts.  The  Department  of  Housing  and 
Urban  Development  has  several  renewal,  rehabilitation,  and 
rent-subsidy  programs  targeted  at  specific  neighborhoods, 
some  containing  only  one  or  two  census  tracts.  Special 
education  funds  go  to  school  districts,  which  seldom  follow 
municipal  boundaries,  and  funds  for  improving  water  quality 
go  to  water  districts,  which  may  not  be  units  of  local 
government. 

In  addition  to  funds-allocation  uses,  virtually  all  Federal 
agencies use census statistics for basic research, for
development of new programs, as well as for regulatory purposes.
The  Office  of  Federal  Contract  Compliance  Programs  and  the 
Equal  Employment  Opportunity  Commission  make  heavy 
use  of  census  statistics  by  ethnicity,  race,  and  sex  to 
determine  if  there  has  been  discrimination  in  hiring  practices. 

For  every  use  of  census  data  in  the  Federal  Government 
there  can  be  found  a  parallel  use  in  State  governments.  State 
and  county  legislative  districts  are  drawn  using  census 
population figures, and about 12 States, including the two
largest, California and New York, make extensive use of
decennial population figures for their own form of revenue
sharing.  These  States  and  all  the  others  also  use  the  data  for 
land-use  planning  and  regulatory  purposes.  For  example,  a 
town  in  New  York  must  have  a  certain  population  before  it 
can  set  its  own  highway  speed  limits  and  perform  other  local 
government  functions. 

All States except Massachusetts participate in a
cooperative  program  with  the  Bureau  of  the  Census  to 
produce  annual  estimates  of  population  for  each  county  and 
SMSA  within  their  borders.  The  decennial  census  is,  of 
course,  the  benchmark  for  these  estimates.  There  can  be  little 
doubt that, following the Federal pattern, States will
continue to make extensive use of these statistics in their
research, planning, and relationships with their local
government units.

Business  Use  of  Census  Data 

The  use  of  census  statistics  has  grown  substantially  in  the 
last  decade.  This  increase  in  reliance  on  demographic  data 
coincides  with  the  maturing  of  the  American  economy  and 
the  pervasive  use  of  computers.  Business  planners  are  now 
frequently  confronted  with  nearly  saturated  markets.  In  that 
kind  of  environment,  market  research  becomes  an  essential 
part  of  staying  in  business.  In  addition,  new  government 
regulations  regarding  equal  employment  opportunities 
require  that  companies  show  with  census  statistics  that  they 
are  not  discriminating. 

Fortunately,  this  increase  in  the  complexity  of  doing 
business  has  been  accompanied  by  the  widespread  and 
relatively  inexpensive  availability  of  computer  technology 
along  with  the  census  data  on  computer  tape.  During  the  last 
decade,  several  private  companies  were  formed  to  serve 
business's  need  for  demographic  statistics.  From  almost 
nothing,  the  demographic  data  business  has  grown  to  the 
point that revenues are now conservatively estimated at
between  $10  and  $20  million  annually  and  are  expected  at 
least  to  double  by  1983.  By  American  business  standards, 
those  are  not  particularly  large  revenues,  but  they  have  a 
substantial  multiplier  effect.  Corporate  decisions  involving 
billions  of  dollars  of  capital  expenditures  are  made  on  the 
basis  of  whether  there  is  a  large  enough  consumer  market  in 
the  right  place  at  the  right  time.  Any  market  researcher  or 
corporate  planner  would  agree  that  the  foundation  of 
consumer  information  is  the  decennial  census.  It  may  take  a 
specialized  survey  to  find  out  what  kind  of  people  will  buy  a 
particular  product  and  how  much  of  it,  but  it  is  the  census 
that  reveals  how  many  buyers  are  out  there  and  where  they 
are. 

For  a  variety  of  reasons,  business's  use  of  census  statistics 
is  not  as  visible  as  government's.  Because  of  competition, 
corporate  planners  are  reluctant  to  divulge  which  census 
statistics  they  use  and  exactly  how  prominently  they  figure 
in  the  decision  process.  The  data  they  use  are  frequently 
purchased  from  a  data  vendor  and  not  directly  from  the 
Census  Bureau.  Also,  there  are  many  thousands  of  corporate 


establishments,  each  making  independent  business  decisions 
about  where  to  expand,  where  to  contract,  what  to  buy,  and 
what  to  sell.  No  one  could  possibly  keep  track  of  the 
individual  decisions;  they  only  become  visible  in  the 
aggregate. 

Without  going  into  case  histories,  several  general  state- 
ments can  be  made  about  the  business  use  of  census 
statistics.  First  of  all,  extreme  accuracy  is  not  essential.  It  is 
not  realistic  to  expect  a  business  decision  to  turn  on  whether 
the  percentage  of  potential  buyers  was  36  or  36.7  percent  of 
the  population,  or  whether  the  market  size  was  1.2  or  1.3 
million.  No  one  can  predict  actual  consumer  behavior  that 
accurately,  and  as  a  result,  a  substantial  margin  of  error  is 
usually  assumed  for  planning  purposes.  Second,  timeliness  is 
important.  If  1-  to  3-year-old  children  are  the  market  for 
your  product,  4-year-old  census  data  are  not  adequate. 
Businesses  will  make  maximum  use  of  the  census  in  1981  and 
1982  and  rely  more  on  estimates  after  that.  Third,  the  trend 
is  often  more  important  than  the  present  condition.  If  the 
census  shows  a  market  size  that  is  sufficient  now,  the 
investment  decision  will  be  different  if  that  market  is 
perceived  to  be  growing  rather  than  declining.  Fourth, 
markets for restaurants, stores, branch banks, and similar
types of establishments rarely, if ever, follow any political
boundaries.  That  sort  of  business  uses  much  more  small-area 
data  (tracts,  enumeration  districts,  or  block  groups). 

Finally,  for  most  consumer  products  and  services  the 
characteristics  of  the  population  are  more  important  than  the 
size  of  the  total  population.  Most  consumer  products  are 
purchased  by  only  a  segment  of  the  population,  which  can  be 
described  in  terms  of  income,  age,  type  of  dwelling  unit,  etc. 

Both  business  and  government  are  relying  on  the  census 
more  every  year.  As  the  marketplace  becomes  more  compli- 
cated and  unpredictable,  business  people  will  purchase  and 
use  more  statistical  data  on  consumers  to  assist  in  making 
business  decisions.  As  the  fragmentation  of  American  society 
continues,  governments  at  all  levels  will  turn  to  census 
statistics  to  mediate  among  the  many  diverse  groups,  each  of 
whom  wants  its  share  of  benefits.  In  the  midst  of  these 
competing  demands  is  the  Bureau  of  the  Census.  Although 
highly  regarded  for  its  professionalism  and  total  confi- 
dentiality, the  Census  Bureau  finds  itself  in  an  uncomfortable 
situation.  The  census  statistics  it  produces  have  become  so 
valuable  that  demands  for  their  accuracy  may  exceed  the 
Census  Bureau's  ability  to  obtain  them,  given  the  attitude  of 
the  public  today.  At  the  same  time,  the  control  of  these 
statistics  is  looked  upon  by  some  as  the  key  to  the  Treasury. 
It  is  in  this  climate  that  proposals  for  undercount  adjust- 
ments have  surfaced. 

THE  IMPACT  OF  UNDERCOUNT  ADJUSTMENT 

Design  Impact 

The  major  purpose  expressed  in  undercount  adjustment 
proposals  is  to  produce  greater  equity  in  the  distribution  of 
Federal funds. This can therefore be construed as the desired
impact.  In  an  analogous  situation,  the  desired  impact  of  a 
new  highway  may  be  to  reduce  traffic  congestion  on  local 
streets. 

The  National  Academy  Panel  on  Decennial  Census  Plans 
reports  that  there  are  many  Federal  programs  that  use 
population  or  per  capita  income,  and  that  the  largest  one  of 
these  is  general  revenue  sharing.  In  this  program,  however, 
income  has  a  greater  weight  than  population;  there  is  a  fixed 
amount  of  money,  and  there  are  special  constraints  that  limit 
how  much  a  recipient  may  gain  or  lose  in  funds.  The  panel 
concludes  that,  as  far  as  the  revenue-sharing  program  is 
concerned,  "adjusting  the  population  count  without 
simultaneously  adjusting  the  income  data  for 
underenumeration  (or  underreporting  of  income)  could 
result  in  little  or  no  improvement  in  the  equity  of  the 
distribution  of  funds  under  this  program."  If  the  single 
largest  formula  grant  program  (accounting  for  almost  20 
percent  of  the  Federal  funds  disbursed  in  this  manner)  is  not 
going  to  be  materially  affected  by  population  undercount 
adjustment,  then  it  would  appear  that  the  desired  impact  is 
somewhat  muted.  Considering  the  fact  that  nearly  70  percent 
of  the  revenue-sharing  recipients  have  a  population  count 
under  2,500,  where  any  adjustment  methodology  is  least 
certain,  the  impact  may  be  negligible. 

Many  of  the  other  Federal  grants  from  HEW  or  HUD  are 
categorical  and  for  very  specific  purposes;  it  is  doubtful  that 
adjusting  population  counts  would  significantly  affect  the 
equity  of  funds  distribution  for  those  programs.  Also,  these 
categorical  grants  generally  do  not  apply  to  the  small 
communities  mentioned  above. 

However,  there  are  programs  at  both  the  State  and 
Federal  levels  in  which  funds  are  distributed  solely  or 
primarily  on  the  basis  of  population.  Here  the  undercount 
adjustment  would  certainly  have  an  impact.  What  cannot  be 
determined  is  the  political  result  of  the  potential  situation  in 
which  a  minority  of  local  governments  get  more  money  and  a 
majority  get  less. 

Unavoidable  Adverse  Impact 

One  result  of  undercount  adjustment  would  be  the 
creation  of  two  sets  of  1980  census  population  figures.  The 
first  set  would  be  the  unadjusted  figures  used  for  reappor- 
tionment and  other  purposes  for  an  estimated  period  of  3  to 
4  years.  This  first  set  would  be  in  all  published  census  books 
for  all  census  geographic  areas,  would  be  on  all  census  tape 
files,  and  would  be  compatible  with  other  census  tabulations. 
The  second  set  would  be  published  early  in  1984  and  be  only 
for  regions,  States,  and  local  governmental  units.  A 
municipality  applying  for  a  HUD  neighborhood 
rehabilitation  grant  would  require  a  directive  from  HUD 
instructing  it  as  to  whether  the  difference  between  the 
adjusted  and  unadjusted  figures  at  the  city  level  should  be 
allocated  down  to  each  census  tract. 


Dr.  Keyfitz  summarizes  the  impact  of  this  situation  when 
he  says,  "Keeping  two  sets  of  books  is  as  confusing  for  a 
statistical  agency  as  for  a  business— there  is  perpetual 
uncertainty  about  which  set  to  use  for  each  application  that 
arises."  But  the  unavoidable  adverse  impact  of  "two  sets  of 
books"  will  unfortunately  not  stop  at  confusion.  Larger 
municipalities  can  logically  be  expected  to  ask  this  obvious 
question:  If  the  Census  Bureau  feels  that  it  can  estimate 
undercount  for  nearly  20,000  places  with  a  population  of 
under 1,000, why shouldn't there be an adjustment for every
enumeration district or block group in each city, or in each
county?

The  Irreversible  Commitment  of  Resources 

Adjustment  begets  more  adjustment.  The  National 
Academy  panel  points  out  that  the  distribution  of 
revenue-sharing  funds  would  be  substantially  more  affected 
by  adjusting  for  underreported  income  and  suggests  that  the 
Census  Bureau  work  toward  the  goal  of  adjusting 
characteristics  of  the  population  as  well  as  the  count.  The 
current  plans  for  demographic  analysis  and  the 
postenumeration  survey  will  cost  at  least  $20  to  $30  million; 
additional  adjustment  will  cost  many  more  millions.  As  more 
adjustment  is  called  for,  more  money  and  staff  time  at  the 
Census  Bureau  will  be  devoted  to  this  purpose  and  diverted 
from  the  task  at  hand— taking  the  census.  But  this  is  only  one 
irreversible  commitment  of  resources. 

There  is  another  commitment  of  resources,  the  total 
impact  of  which  is  more  difficult  to  assess  but  which  will 
certainly  occur.  This  is  the  time  and  money  that  will  be  spent 
in  lawsuits.  To  quote  Dr.  Keyfitz,  "The  courts  and  the  public 
will  be  treated  to  a  fireworks  display  of  statistical  and 
demographic  exposition,  with  a  generous  mixture  of  truth 
and  fallacy."  No  one  can  predict  the  outcome  of  these  suits, 
but  it  is  safe  to  say  that  much  professional  time  will  be 
diverted  from  more  productive  work.  The  Census  Bureau  has 
been  sued  by  a  group  that  is  against  counting  illegal  aliens  for 
reapportionment.  There  must  be  a  large  but  as  yet  unknown 
undercount  among  aliens.  How  this  group's  population 
should  be  adjusted  may  be  determined  more  by  legal 
precedent  than  sound  statistical  practice.  Fortunately,  that 
suit  was  decided  in  favor  of  the  Census  Bureau. 

Long-Term  Use  and  Productivity 

The  impact  on  long-term  use  and  productivity  will  be 
substantial.  The  major  impact  will  occur  because  attention 
will  be  diverted  from  obtaining  the  most  accurate  census 
possible  to  obtaining  the  most  advantageous  adjustment. 
Adjustment  for  the  undercount  cannot  be  hidden  from  the 
public,  from  census  enumerators,  from  local  public  officials, 
or  from  the  media.  The  total  cooperation  of  all  of  these 
people  is  essential  in  obtaining  the  best  possible  response 
rates. Right now the incentives to publicly support the
census are enormous. But once it has become known that the
figures  will  be  "fudged"— and  that's  the  way  it  will  be 
reported  in  the  media— the  incentives  to  cooperate  will  be 
replaced  by  the  incentive  to  dispute  the  adjustment 
procedure. 

The  National  Academy  Panel  reported  on  the  subject  of 
recruitment  of  enumerators  that  "the  experiences  of 
temporary  workers  during  one  census  may  be  factors  that 
affect  their  willingness,  and  the  willingness  of  others,  to  work 
in  the  next  census."  Enumerators,  like  any  workers,  will 
perform  better  and  be  more  productive  when  they  believe 
that  their  work  is  important.  Shifting  the  focus  from 
obtaining  as  complete  a  census  as  possible  to  undercount 
adjustment  diminishes  the  importance  of  250,000  workers 
and  will  have  a  long-term  negative  impact  on  their 
productivity. 

There  will  also  be  a  negative  impact  on  the  productivity  of 
the  permanent  staff  at  the  Census  Bureau.  Using  an 
adjustment  procedure  that  is  known  in  advance  to  have  some 
"arbitrary  features"  is  to  invite  attacks  from  all  sides. 
Undercount  adjustment  has  been,  and  will  continue  to  be, 
viewed  as  the  most  vulnerable  part  of  the  Bureau  of  the 
Census.  Future  attempts  to  control  census  figures  will  start 
with  the  idea  of  having  some  "disinterested  third  party"  do 
the  undercount  adjustment.  The  amount  of  Census  Bureau 
staff  and  management  time  spent  on  defending  the 
adjustment  procedures  and  defending  itself  will  seriously 
impair  its  ability  to  carry  out  its  hundreds  of  other  statistical 
tasks. 


Mitigating  Measures 

There  are  some  mitigating  measures  that  might  be  taken. 
The  first  would  be  to  limit  the  production  of  undercount 
estimates  to  the  State  level.  This  would  have  to  be 
accompanied  by  an  unequivocal  statement  by  the  Director 
that  it  would  be  arbitrary  and  statistically  indefensible  to  go 
any  lower.  Given  these  estimates,  Congress  could  then  decide 
to  amend  existing  legislation  or  pass  new  legislation 
incorporating  an  adjustment  factor  for  each  State. 
Calculations could be made in advance of how much a State
would gain or lose with or without adjustment. The decision
whether  or  not  to  incorporate  an  adjustment  becomes  a 
political  matter.  Certainly,  more  complicated  issues  have 
been  handled  by  the  political  process. 

The  estimates  of  differential  undercount  have  been 
limited  to  age,  race,  sex,  and  ethnic  origin.  By  continuing  to 
limit  the  estimates  to  the  short-form  population  items,  the 
Census  Bureau  will  avoid  the  statistical  quagmire  of  adjusting 
such  items  as  income  or  housing  value  for  sampling  error, 
underreporting,  nonresponse,  enumerator  bias,  etc.  Here 
again,  a  clear  statement  from  the  Census  Bureau  on  the 
arbitrary  nature  of  such  adjustments  would  do  much  to  end 
the  debate.  The  public  image  of  the  Census  Bureau  would 
not  be  enhanced  if  it  were  reported  that  after  the  census 
questionnaires  were  processed,  the  Bureau  would  examine 
income  tax  and  social  security  records  to  check  the  reported 
incomes.  Few  people  would  believe  that  was  a  one-way 
process. 


Impact  on  Business 

Initially,  the  impact  of  adjustment  on  the  business  use  of 
census  information  is  not  expected  to  be  very  great.  By  the 
spring  of  1983,  most  businesses  and  all  private  data 
companies  will  have  obtained  the  figures  they  need.  In  any 
case,  reference  would  be  made  to  the  printed  reports  or 
computer  tapes,  which  would  not  contain  adjusted  numbers. 
After  1983,  most  business  planners  will  probably  use  current 
estimates  or  projections  based  on  the  trends  in  unadjusted 
data.  There  may  be  some  confusion  in  the  marketplace  with 
two  sets  of  1980  population  figures,  but  most  market 
researchers  would  probably  just  ignore  the  adjusted  counts. 
In  any  case,  the  people  who  use  small-area  data  would  not 
have  any  choice  but  to  use  the  data  on  the  summary  tapes. 

The  long-term  impact  may  be  substantial.  If  the  quality  of 
the  census  enumeration  is  impaired  or  the  integrity  of  the 
published  figures  compromised,  business  planners  will  lose 
confidence in the statistics and look for an alternative.
Unfortunately,  there  really  is  no  satisfactory  alternative.  No 
one  else  produces  statistical  benchmarks.  As  the  uncertainty 
of  a  business  investment  increases,  so  does  the  aggregate  cost 
of  that  investment.  Part  of  controlling  inflation  consists  in 
lowering  the  cost  of  a  business,  not  raising  it. 


The  Alternative 

Is  there  an  alternative  to  undercount  adjustment?  Perhaps. 
The  most  important  reason  for  lack  of  a  100-percent 
enumeration  is  the  attitude  of  certain  segments  of  our 
population.  Feelings  of  hostility,  alienation,  and  a  desire  for 
personal  privacy  will  continue  to  exist.  These  can  only  be 
overcome  by  a  substantial  commitment  to  educate  the  public 
on  the  importance  of  the  decennial  census.  Part  of  this 
commitment  must  be,  of  course,  to  guard  against  trivial 
questions  getting  on  the  census  form.  Another  part  is  to 
continue  to  maintain  total  confidentiality.  But  a  large  part 
would  be  to  let  every  public  official  know  that  the  final 
census  count  would  be  the  last  word  for  funding  purposes. 
Much  more  attention  could  then  be  paid  to  creation  of 
address  coding  guides,  intercensal  estimates,  local  review,  and 
other  activities  that  aid  the  accuracy  of  the  census.  The 
census  is  currently  promoted  through  an  intensive  advertising 
campaign  that  begins  a  few  months  before  the  census  and 
ends  shortly  thereafter.  But  a  major  effort  at  public 
education  must  be  an  ongoing  program,  not  just  once  every 
10  years.  Last  year,  for  the  first  time,  a  section  on 
demographic  trends  appeared  in  the  President's  budget.  This 
is  recognition  of  the  importance  of  demographic  change  and 
a beginning in the educational process. The independent
Population Reference Bureau, with a very small staff and
limited  budget,  carries  on  an  effective  program  of  education 
about  population  matters  in  primary  and  secondary  schools. 
The  College  Curriculum  Support  Project  in  the  Census 
Bureau  provides  materials  for  college-level  courses.  Their 
work  is  a  useful  model  of  how  to  develop  public  awareness  of 
certain  issues. 

These  programs  are  at  present  severely  limited  in  terms  of 
staff  and  funding.  An  expansion  of  these  activities, 
supplemented  with  some  of  the  innovative  ideas  that  have 
come  from  the  new  Census  Promotion  Office,  could  have  a 
positive  impact  on  public  attitudes  for  every  census  activity. 
The  alternative  to  statistical  adjustment  is  to  change  attitudes 


about  our  Government's  need  for  information  and  about  the 
decennial  census.  It  is  important  that  we  resist  the 
temptation  to  adjust  census  figures  in  the  endless  search  for 
perfect  fiscal  equity.  Because,  ultimately,  we  will  have 
neither  equity  nor  a  decent  statistical  environment. 
INTRODUCTION

In  the  papers  before  us,  Francese  has  argued  that  a  viable 
alternative  to  correcting  the  1980  census  for  a  likely  under- 
count  would  be  ".  .  .to  change  attitudes  about  our  Govern- 
ment's need  for  information  and  about  the  decennial  census." 
This  educational  approach  to  correcting  errors  is  justified  in 
his  view  because  neither  fiscal  equity1  nor  a  decent  statistical 
environment  will  result  from  undercount  corrections.  Keyfitz 
is somewhat more sympathetic to making corrections, lays
out  for  us  some  of  the  options  available  for  correcting  the 
undercount,  and  concludes  that  if  any  correction  is  to  be 
made,  it  should  be  a  very  simple  one.  He  suggests,  for  ex- 
ample, that  if  a  6-percent  error  rate  for  blacks  is  found,  then 
every  black  should  be  multiplied  by  1.06. 

I find both positions, doing nothing for fear of the risks
and doing something simple on the grounds of general
intelligibility, to be unacceptable postures for the Federal
Government to take. Keyfitz's approach (a sort of "rough justice")
suffers from not taking advantage of all the scientific
evidence that is likely to be available. As such, it might well
be  found  to  be  arbitrary  and  capricious,  were  it  litigated, 


which  in  turn  could  block  it  ever  being  actually  implemented. 
Also, on scientific grounds, I find not using all available
information objectionable. With respect to Francese's view, I find
the equities to be sacrificed for the reasons provided to
be far too costly in comparison to the gains from inaction.
My  perspective  on  the  1980  census  undercount  is  the 
same  as  that  held  on  the  1970  undercount:  It  should  be 
corrected  and  reflected  in  official  Census  Bureau  statistics  in 
as  much  detail  as  analysis  can  provide  (race,  sex,  age,  income, 
geographic  area,  etc.).2  In  my  remarks  below,  I  elaborate 
this  view  by  examining  in  more  detail  the  equities  at  risk 
and  by  discussing  some  of  the  administrative  arguments 
made  against  correcting  for  the  undercount. 

THE  EQUITIES  AT  RISK  IN  FAILING  TO 
CORRECT  FOR  THE  UNDERCOUNT 

Much  of  the  justification  for  acting  to  correct  the  likely 
undercount  in  the  1980  census  is  based  on  the  inequities 
that  will  result  from  subsequent  use  of  the  data  in  Federal 
grants-in-aid  formulas.  I  think  that  this  justification  can  be 
sharpened    considerably   by   examining  the  equity  issue  in 


1  Francese  notes  that  for  fiscal  equity  purposes— the  use  of  Census 
data  for  allocating  Federal  revenue  sharing  funds  to  local  govern- 
ments—the correction  of  the  population  undercount  is  not  as  im- 
portant as  the  correction  of  the  income  data.  He  then  cites  a  National 
Academy  (1978)  study  in  support  of  this.  If  we  view  "importance" 
on  an  analytical  or  logical  basis,  this  is  not  correct,  for  it  can  readily 
be  shown  that  the  basic  intrastate  formula  can  be  rewritten  as: 
where,

%_i is the percentage share of a fixed dollar amount for the ith government
P_i is the population of the ith government
T_i is the adjusted taxes of the ith government
Y_i is the total census money income of the ith government

Clearly,   P   and    Y  enter  the  above  statement   in   symmetric,  albeit 
opposite  directions. 
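
The displayed formula is missing from this copy. A form consistent with the symbol definitions above, and with a three-factor structure of population, tax effort (T/Y), and inverse per capita income (P/Y), is sketched below. It is an assumption offered for orientation, not the author's exact expression; the summation index j over all governments in the State is likewise assumed.

```latex
\%_i \;=\;
\frac{P_i \cdot \dfrac{T_i}{Y_i} \cdot \dfrac{P_i}{Y_i}}
     {\displaystyle\sum_j P_j \cdot \dfrac{T_j}{Y_j} \cdot \dfrac{P_j}{Y_j}}
\;=\;
\frac{P_i^{2}\, T_i / Y_i^{2}}
     {\displaystyle\sum_j P_j^{2}\, T_j / Y_j^{2}}
\tag{1}
```

In this form P_i enters squared in the numerator of the ith term and Y_i enters squared in its denominator, which is one way to read the remark that the two enter symmetrically but in opposite directions: equal proportional corrections to P_i and Y_i cancel.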

Robinson  and  Siegel  report  empirical  results  for  correcting  P 
and  Y  and  noting  changes  in  revenue  sharing  payments,  but  for 
different  problems.  Their  correction  of  P  was  for  undercounts,  while 
their  correction  of  Y  in  their  experiments  was  for  underreporting  of 
income  by  those  who  were  initially  counted.  It  is  not  clear  that  their 
results  relate  entirely  to  the  question  of  whether  corrections  to  P 
or   Y  are  empirically  more  important  in  the  formula,  since  they  did 


not  attempt  to  impute  income  to  the  undercounted,  which  would 
be  the  other  component  of  measurement  error  in  the  Census  Bureau 
money income concept. Also, correcting Y for possible underreporting
of income through the use of BEA personal income data is an
indirect method at best, since the BEA income concept is far more
inclusive.  Their  corrections  and  therefore  empirical  results  reflect 
both  the  correction  for  underreporting  and  the  use  of  a  broader 
income concept. Of course, if ΔP = ΔY resulting from the corrections,
then %_i will be unchanged.

I might also note their correction of P in their simulations
understates the effect of a correction in P, as they view (1) as:

Since it is well known that PCI is a derived figure from the raw
Census data, PCI_i = Y_i / P_i, they in effect fail to correct per capita
income's denominator by only adjusting P_i in (2).
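
The formula labeled (2) is also missing from this copy. A version consistent with the footnote's argument, in which per capita income appears as a separate factor, is sketched below. It is an assumed reconstruction, not the author's exact expression, and the summation index j is assumed.

```latex
\%_i \;=\;
\frac{P_i \cdot \dfrac{T_i}{Y_i} \cdot \dfrac{1}{PCI_i}}
     {\displaystyle\sum_j P_j \cdot \dfrac{T_j}{Y_j} \cdot \dfrac{1}{PCI_j}},
\qquad
PCI_i \;=\; \frac{Y_i}{P_i}
\tag{2}
```

Read this way, correcting only the leading P_i while holding PCI_i at its published value changes the share in proportion to P_i, whereas carrying the correction through PCI_i as well changes it in proportion to the square of P_i.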

2 See Robert P. Strauss and Peter B. Harkins (1974), The 1970
Census  and  Revenue  Sharing:  Effects  on  Allocations  in  New  Jersey 
and  Virginia,  (Washington,  D.C.:  Joint  Center  for  Political  Studies), 
or  "The  Impact  of  Population  Undercounts  on  General  Revenue 
Sharing  Allocations  in  New  Jersey  and  Virginia,"  National  Tax 
Journal,  Vol.  XXVII,  No.  4  (December  1974). 
terms of who benefits from the grants, the distinction between
horizontal and vertical equity as usually made in the
public finance literature, and by examining the impact of these
equity issues over time, e.g., the next 10 years.

I take it to be axiomatic that correction of the undercount
in  the  1980  census  constitutes  a  "benefit."  The  analytical 
problem  entails  measurement  of  benefits  against  the  costs. 

It  is  sometimes  argued  that  in  the  case  of  fixed  amount 
grants,  there  are  going  to  be  "winners"  and  "losers."  As  a 
result  of  the  change  in  allocation  due  to  a  correction,  some 
are  worse  off  and  some  are  better  off.  It  then  might  follow 
that  the  two  need  to  be  balanced  before  making  the  correc- 
tion. This  is  not  the  place  to  get  into  an  extended  discussion 
of  whether  Pareto  redistribution  rules  are  appropriate  cri- 
teria, but  I  would  argue  that  most  theoretical  social  welfare 
analysis  treats  positive  and  negative  deviations  from  the 
"true"  or  desired  outcome— in  this  case,  more  accurate  state- 
ments of  population,  income,  etc.— on  an  equal  footing. 
Put  another  way,  the  loss  function  is  in  absolute  difference 
terms,  not  the  sum  of  pluses  and  minuses.  Since  there  is  a 
fixed  sum  to  be  allocated,  the  total  amount  of  gain  and  loss 
must necessarily be equal, so arguing for a winner/loser
count as a criterion would always amount to a decision to
take  no  action. 
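
The fixed-sum argument can be illustrated with a small sketch. The three jurisdictions, their populations, and the size of the pot are hypothetical, and allocation strictly in proportion to population is a simplifying assumption rather than any actual grant formula.

```python
def allocate(pot, weights):
    """Split a fixed pot in proportion to the given weights."""
    total = sum(weights.values())
    return {name: pot * w / total for name, w in weights.items()}


# Hypothetical census counts and corrected counts (illustrative only).
counted = {"A": 100_000, "B": 50_000, "C": 50_000}
corrected = {"A": 105_000, "B": 50_500, "C": 50_000}

pot = 1_000_000.0  # fixed dollar amount to be distributed
before = allocate(pot, counted)
after = allocate(pot, corrected)
changes = {name: after[name] - before[name] for name in counted}

# Signed changes cancel because the pot is fixed.
net_change = sum(changes.values())
gains = sum(c for c in changes.values() if c > 0)
losses = -sum(c for c in changes.values() if c < 0)

# A loss function in absolute-difference terms does not cancel.
absolute_error = sum(abs(c) for c in changes.values())

winners = [name for name, c in changes.items() if c > 0]
losers = [name for name, c in changes.items() if c < 0]
```

In this sketch one jurisdiction gains and two lose, yet total gains equal total losses, so neither a head count of winners and losers nor the signed sum says anything about how far the uncorrected allocation sits from the corrected one. The absolute-difference measure does.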

Equity  and  the  Unit  of  Analysis 

Because  general  revenue  sharing  (GRS)  is  so  large  in 
dollar  value  (roughly  $6.8  billion/year)  and  so  ubiquitous 
in  terms  of  allocation  to  general  government  (roughly  39,000 
jurisdictions),  it  has  been  taken  for  granted  in  most  discus- 
sions of  improving  the  data  on  these  jurisdictions  that  the 
improvement  in  data  will  benefit  the  governments.  Because 
the  governments  literally  receive  Federal  checks  from  GRS 
and  other  grant-in-aid  schemes  (including,  of  course,  State 
grant-in-aid  schemes),  it  is  commonplace  to  assume  that  the 
State  and  local  governments  are  the  beneficiaries  of  improved 
population  figures.  By  contrast,  I  would  argue  that  the 
grants  are  there  for  the  provision  of  services  by  the  govern- 
ments to  individuals,  and  the  beneficiaries  of  improved  data 
are  individuals.  When  we  talk  about  more  accurately  assuring 
individuals  of  their  proper  amount  of  Federal  aid,  rather 
than  governments,  I  believe  the  nature  of  the  concern  changes 
and  becomes  more  compelling.  That  is,  when  we  talk  in 
some  sense  about  "fairness,"  which  is  the  usual  synonym 
for  equity,  I  think  the  debate  about  whether  or  not  to 
correct  for  the  undercount  becomes  more  compelling  when 
we  address  initially  the  impact  of  the  correction  on  indi- 
viduals, rather  than  on  governments. 

The  Horizontal/Vertical  Equity  Distinction  and 
the  Case  for  Correcting  Undercount 

It  is  common  to  distinguish  between  the  treatment  of 
individuals    in   the   same   economic   situation— this    involves 


matters  of  horizontal  equity— and  the  treatment  of  individ- 
uals in  different  economic  situations— this  involves  matters  of 
vertical  equity.  In  taxation,  the  principles  are  that,  in  the 
horizontal  case,  the  individuals  should  pay  the  same  taxes, 
both  in  amount  and  of  course  in  rate.  In  the  vertical  case, 
it  is  generally  accepted  that  the  tax  system  should  require 
equal  sacrifice  among  individuals  of  differing  abilities  to 
pay,  and  that  under  most  interpretations  of  the  varying 
utility  of  income,  this  will  involve  a  progressive  tax  system 
in  which  the  tax  rate  on  low-income  persons  is  smaller  than 
the  rate  on  high-income  persons. 

Now,  I  would  judge  much  of  the  discussion  over  the 
"equities"  at  risk  in  the  two  papers  to  tacitly  assume  that 
horizontal  equity  is  at  risk,  ignoring  the  issue  of  vertical 
equity.  This  is  especially  the  case,  since  the  unit  of  analysis 
is  assumed  to  be  the  governmental  unit.  The  sort  of  argu- 
ment I  hear  in  these  and  other  papers  seems  to  be  that  if 
we  can  get  the  populations  right,  then  the  governments  will 
be  put  on  an  equal  footing.  If  the  numbers  were  not  to 
change  much  (1  to  3  percent),  it  could  be  argued  (I  think 
this  underlies  Keyfitz's  simplified  approach)  that  they  are 
already on a nearly equal footing; therefore, why bother?
When  one  views  the  beneficiary  of  Federal  aid  to  be  indi- 
viduals, however,  I  believe  the  problem  of  horizontal  equity 
looms  larger,  and  one  becomes  more  reluctant  to  ignore 
3-percent  errors. 

If  one  views  vertical  equity  to  be  at  risk  initially,  however, 
I  think  again  the  argument  for  making  the  corrections  is 
more  compelling.  For  example,  I  think  the  motivation  for 
general  revenue  sharing  and  most  other  grant-in-aid  pro- 
grams is  essentially  redistributive  in  nature.  That  is,  the 
purpose  of  most  if  not  all  Federal  grants  is  to  achieve  a 
defined  vertical  equity  through  the  redistribution  of  Federal 
funds  to  individuals.  Viewed  in  vertical  terms,  the  failure 
to  correct  for  the  undercount  may  be  viewed  to  be  a  failure 
of  Federal  programs  to  redistribute  to  less  well-off  indi- 
viduals from  better-off  individuals.  Because  the  less  well-off 
tend  to  be  black,  young,  and  male,  the  loss  in  vertical  equity 
of  not  correcting  for  their  being  undercounted  is  substantial. 
It  is  precisely  because  many  Federal  programs  attempt  to 
assist  these  individuals  that  their  undercount  is  so  egregious 
and  demands  correction. 

Equities  at  Risk  Over  Time 

In  computing  costs  and  benefits,  it  should  be  remembered 
that  the  benefits  accrue  over  time,  and  unless  we  have  a  mid- 
decade  census,  the  benefits  will  accrue  through  1990.  In 
deciding  whether  or  not  to  correct  for  the  undercount,  these 
benefits  and  any  associated  costs  should  be  discounted  back 
to  the  present,  using  some  appropriate  discount  rate. 

To  get  some  rough  magnitude  of  the  equity  risk,  let  us 
assume  that  Federal  grants  are  to  rise  at  10  percent  per  year 
over  the  next  10  years,  that  a  10-percent  discount  rate  is 
appropriate,  and  that  there  are  $50  billion  per  year  at  issue 
with a possible error of 1-percent undercount rate. Back of
the  envelope  calculations  then  suggest  that  $50  million  per 
year  is  at  risk,  or  $500  million  for  the  entire  period.  If  we 
are  willing  to  weight  the  value  of  not  getting  funds  properly 
to  the  poor  more  heavily  than  erroneously  giving  funds  to 
the  rich,  e.g.  weight  the  absolute  values  of  the  errors,  then 
the  benefit  will  be  larger. 
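The discounting arithmetic above can be sketched as follows. This is an illustrative calculation only: the function name and the end-of-year timing convention are assumptions, and the $50 million first-year figure is taken as given from the text. When the growth rate equals the discount rate, each year's discounted amount equals the first-year amount, so the 10-year total is simply ten times that figure.

```python
def present_value_at_risk(first_year_amount, growth, discount, years):
    """Sum the discounted annual amounts at risk.

    Assumes (hypothetically) that the amount in year t has grown by
    (1 + growth) ** t and is discounted by (1 + discount) ** t.
    """
    return sum(
        first_year_amount * (1 + growth) ** t / (1 + discount) ** t
        for t in range(1, years + 1)
    )


# Figures from the text: $50 million per year, 10-percent growth,
# 10-percent discount rate, 10 years.
pv = present_value_at_risk(50e6, growth=0.10, discount=0.10, years=10)
print(f"Present value at risk: ${pv / 1e6:,.0f} million")

# If Federal aid grows more slowly than the discount rate, the total shrinks.
pv_slow = present_value_at_risk(50e6, growth=0.05, discount=0.10, years=10)
print(f"With 5-percent growth: ${pv_slow / 1e6:,.0f} million")
```

Under these assumptions the first figure reproduces the $500 million total cited in the text; the second shows how sensitive the total is to the assumed growth of grants.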

It  is  difficult  for  me  to  envision  that  a  loss  of  this  order 
of  magnitude  would  not  warrant  concerted  action  by  the 
Bureau  as  well  as  the  Office  of  Statistical  Policy  and  Stand- 
ards to  correct  for  the  undercount.  To  be  sure,  these  are 
rough  assumptions,  but  indicative  of  the  equities  at  risk  over 
time.  It  will  be  interesting  to  see  if  the  benefit  of  this  magni- 
tude is  matched  by  a  corresponding  investment  cost  to 
correct  for  undercount. 

ADMINISTRATIVE AGGRAVATION AS A REASON FOR NOT CORRECTING THE UNDERCOUNT

Both  papers  indicate  that  the  Bureau,  were  it  to  correct 
for  the  undercount,  would  have  to  keep,  in  effect,  two  sets 
of  books.  Because  the  correction  might  come  substantially 
after  the  initial  enumeration,  it  is  argued  that  the  Census 
would  have  to  publish  another  complete  set  of  documents. 
This  would  be  somewhat  expensive  (paper,  personnel,  and 
computer  costs),  open  the  Bureau  to  possible  litigation  and 
congressional  intervention,  and  be  generally  untidy. 

As  an  economist,  indeed  even  as  an  academic  economist, 
I  must  say  that  I  find  the  conflict  surrounding  the  under- 
count issue  to  be  a  sign  of  health  and  vitality  in  our  Federal 
statistical  system.  It  is  understandable  that  an  agency  pro- 
ducing a  product,  the  Bureau  and  its  numbers,  does  not  like 
to  have  the  product's  quality  questioned  nor  be  told  how  to 
produce  its  numbers.  Fortunately,  even  though  the  Bureau 
is  a  public  monopoly,  it  has  participated  in  the  analysis  of 
its  strengths  and  weaknesses.  Economic  theory  generally 
predicts  that  competition,  which  is  often  accompanied  by 
conflict  and  in  this  case  administrative  aggravation,  fre- 
quently improves  the  quality.  Thus,  I  view  the  aggravation 
that  might  result  from  the  correction  of  the  undercount  to 
be  a  benefit  rather  than  a  cost. 

This  conflict  over  the  undercount  that  has  occurred  over 
the  last  few  years  was  certainly  predictable  in  view  of  the 
change  in  the  way  the  Federal  Government  has  been  making 
grant-in-aid  payments.  The  movement  from  discretionary 
awards  to  formula-based  grants  ensured  that  there  would  be 
more  scrutiny  of  the  underlying  data  base,  and  as  a  result, 
more  pressure  for  correction  of  known  errors.  With  respect 
to  the  future,  I  think  it  is  quite  clear  that  as  a  result  of  the 
slowdown  in  the  long-term  growth  path  of  the  economy, 
concern  over  the  size  of  the  public  sector,  and  the  desire  to 
balance the budget, Federal aid will grow more slowly than
in  the  past,  perhaps  less  than  the  10  percent  used  above.  The 
implication of this for the aggravation content of our Federal
statistical data base is that it will increase, as people compete
for  scarce  dollars  by  arguing  that  the  data  need  adjustment. 
I  view  this  conflict,  however,  to  be  inherently  helpful  to 
ensuring  that  we  have  the  best  statistics  in  the  world.  If  there 
is  no  constant  scrutiny  and  review  of  our  statistical  ma- 
chinery, especially  in  view  of  the  centralization  in  Federal 
statistics  promoted  by  this  Administration,  I  am  concerned 
that  an  inferior  product  will  result. 

With  respect  to  the  confusion  that  is  said  to  possibly  re- 
sult from  the  Bureau  keeping  two  sets  of  books,  I  think 
several  observations  are  in  order.  First,  there  is  precedent  for 
correcting  data  from  a  census  when  it  is  put  in  machine 
readable  form.  The  per  capita  incomes  that  were  printed  in 
the  1970  Fourth-Count  volumes  and  those  made  available 
on  tape  much  later  were  not  identical.  More  convincing, 
however,  is  the  point  that  Federal  statistics  are  generally 
updated,  revised,  rebenchmarked,  and  corrected.  For  ex- 
ample, our  GNP  series,  which  is  used  for  a  wide  variety  of 
public  policy  purposes,  goes  through  several  corrections, 
and  the  user  community  has  not  been  perplexed  by  this. 
Also,  there  is  ample  precedent  in  the  general  revenue  shar- 
ing program  for  correcting  data;  Treasury  has  for  many 
years  been  giving  governmental  recipients  the  adjusted  tax 
data,  and  asked  them  to  approve  it  or  complain  about  it. 
Admittedly,  it  is  more  difficult  for  the  population  and 
income  data  to  be  handled  on  that  basis,  but  the  contrast 
between  the  Treasury's  inviting  criticism  and  the  Bureau's 
reluctance  to   correct  for  known  errors  is  surely  striking. 

CONCLUSION 

The  chief  justification  in  my  mind  for  correcting  the  1980 
census  for  the  likely  undercount  involves  the  equities  that 
will  be  at  risk  if  nothing  is  done.  Both  papers  acknowledge 
that,  because  the  results  of  the  census  will  be  widely  used, 
there  are  grounds  for  making  the  corrections.  However, 
neither  finds  the  equities  at  risk  to  be  so  compelling  that  a 
complete  correction  be  made  for  the  known  undercount  by 
age,  race,  sex,  income,  etc.  In  my  review  of  the  equities  at 
risk,  I  have  tried  to  demonstrate  that  the  most  significant 
equities  at  risk  involve  the  income  redistribution  efforts  of 
the Federal Government vis-a-vis individuals, not governments
as is frequently suggested. When we focus on the
vertical  equities  of  individuals  that  will  be  at  risk  if  no  cor- 
rection is  made,  especially  in  light  of  the  amounts  of  money 
and  the  extended  time  period  involved,  the  concerns  over 
administrative  inconvenience  that  such  a  correction  might 
entail  appear  to  be  minor.  Finally,  there  is  ample  precedent 
in  a  wide  variety  of  other  Federal  statistics  for  the  correction 
of  known  data  errors. 

In  sum,  I  think  the  only  responsible  stance  that  can  be 
taken  vis-a-vis  the  undercount  is  that  the  1980  census  be 
speedily  corrected  for  it. To  do  less  would  violate  the  public 
trust  the  Bureau  enjoys  to  create  the  most  accurate  data 
possible. 


Floor Discussion


The  question  was  raised  as  to  how  a  statistical  agency 
attempts  to  implement  the  intent  of  Congress.  It  was  ob- 
served that  Congress  is  not  reapportioned  by  a  complex  set 
of  calculations  such  as  the  gross  national  product.  However 
simplistic  a  census  adjustment  might  be,  it  still  would  re- 
quire two  "sets  of  books,"  and  this  would  be  a  mistake. 
It  was  also  stressed,  however,  that  opening  the  door  to  ad- 
justment does  not  necessarily  lead  to  increased  litigation; 
the  courts'  contribution— through  relying  on  experts— would 
be  slight.  An  immediate  crude  adjustment  that  can  be  refined 
later  as  the  data  become  available  was  advocated.  For  ex- 
ample, each  black  might  count  for  1.06  persons  in  revenue- 
sharing  calculations.  This  would  avoid  two  sets  of  figures 
and  could  even  be  legislated.  Even  in  the  absence  of  new 
legislation,  if  the  relative  underenumeration  of  a  group  is 
known,  a  factor  could  be  applied  to  increase  the  amount  of 
funding  at,  say,  the  municipal  level.  This  assumes  that  the 
true  population  is  not  known.  What  is  being  sought  is  a  way 
to  prevent  the  intrinsic  error  from  affecting  equity  for  major 
groups.  Although  it  was  feared  by  others  that  an  arbitrary 
1.06  adjustment  would  not  stand  up  in  court,  as  it  would 
not  take  advantage  of  all  of  the  information  available,  a 
simple  solution  such  as  this  would  forestall  a  "free-for-all" 
in  the  construction  based  on  the  census  by  local  officials. 
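The effect of a fixed coverage factor of the kind proposed here can be sketched with a small calculation. Everything in this example is hypothetical: the two areas, their counts, and the function names are invented for illustration, and only the 1.06 factor comes from the discussion. The sketch applies the factor to one group's enumerated count and compares each area's share of a population-based allocation before and after.

```python
def adjusted_population(counts, factors):
    """Multiply each group's enumerated count by its coverage factor.

    Groups without a factor are left unadjusted (factor 1.0).
    """
    return sum(count * factors.get(group, 1.0) for group, count in counts.items())


def allocation_shares(areas, factors):
    """Return each area's share of the total (adjusted) population."""
    adjusted = {name: adjusted_population(c, factors) for name, c in areas.items()}
    total = sum(adjusted.values())
    return {name: pop / total for name, pop in adjusted.items()}


# Hypothetical enumerated counts for two areas of equal size.
areas = {
    "City A": {"black": 40_000, "other": 60_000},
    "City B": {"black": 5_000, "other": 95_000},
}

unadjusted = allocation_shares(areas, {})
adjusted = allocation_shares(areas, {"black": 1.06})

for name in areas:
    print(f"{name}: {unadjusted[name]:.4f} -> {adjusted[name]:.4f}")
```

With equal enumerated totals, the unadjusted shares are equal; the factor shifts share toward the area where the undercounted group is a larger part of the population. The sketch says nothing about whether a single national factor is accurate for any particular area, which is the objection raised by other participants.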

It  was  noted,  however,  that  the  largest  redistribution 
program  depends  more  on  income  than  on  population, 
so  that  the  effect  will  be  negligible  if  an  adjustment  is  made 
for  population  alone.  Nonetheless,  there  was  little  discus- 
sion of  a  census  adjustment  for  income. 

It  was  noted  that  adjusted  figures  have  not  been  published 
for  previous  censuses,  and  it  was  presumed  that  unadjusted 
figures  will  be  published  for  1980.  It  was  suggested  that  the 
Bureau  should  also  publish,  however,  some  adjusted  figures 
that  are  arrived  at  by  a  convention  determined  by  a  desig- 
nated "censor"  and  the  decisionmaking  bodies.  The  two  sets 
of  figures  will  differ,  but  this  situation  will  be  an  incentive 
to  reduce  the  differential  in  1990.  Further,  the  decision- 
making bodies  will  have  more  liberty  in  choosing  the  figures 
to  be  used.  There  will  be  some  public  confusion,  but  this 
should  be  reduced  with  time  and  use  of  the  data.  Other 
participants  reacted  that  two  sets  of  census  figures  should 
never  be  published,  however.  They  argued  that  nothing  other 
than  an  official  count  should  be  issued.  The  Bureau  might 
make  available  estimates  of  the  undercount,  with  which  the 
user  could  do  his  own  arithmetic  to  arrive  at  adjusted  figures. 
In  this  way  the  population  count  would  not  be  tested. 

The group returned to a discussion of the advantages of a
simple adjustment procedure. It was questioned whether a
simple  method  would  rid  the  census  of  gross  errors  when 
some  tests  of  more  elaborate  techniques  correlate  poorly 
with  the  simple  estimates  being  proposed.  Some  findings 
also  suggest  that  adjustment  for  age  and  sex  is  unimportant; 
the  key  item  is  race.  Others  indicate  that  geographic  location 
may  be  most  important;  e.g.,  central  city  versus  suburb. 
Further,  the  National  Academy  of  Sciences  found  that  the 
accuracy  of  synthetic  estimation  is  not  great,  and  its  feasi- 
bility is  questionable.  The  main  point  is  to  choose  an  ac- 
ceptable convention.  If  the  postenumeration  survey  (PES) 
gives  the  equivalent  of  a  complete  count,  then  regressions 
can  be  made  on  the  PES,  but  there  should  not  be  a  large 
number  of  adjustments  to  attain  minor  gains  in  accuracy. 
In  the  past,  PES's  have  been  incomplete,  and  simply  pro- 
vided another  variant.  It  may  take  careful  demographic 
analysis  to  arrive  at  gross  estimates  for  race  at  the  State 
level,  so  there  is  no  "simple"  method  to  obtain  gross 
effects. 

It  was  also  thought  to  be  doubtful  that  there  is  a  single 
"best"  adjustment.  The  pressures  to  produce  one  were 
questioned  and  some  participants  felt  that  it  would  be  better 
to  attempt  several,  such  as  in  the  Bureau's  projections 
where  "A,"  "B,"  and  "C"  series  are  produced. 

The  Census  Bureau  indicated  that  it  is  working  with  a 
modification  of  a  simple  synthetic  method  using  race  (and 
age  and  sex,  if  desired)  to  take  into  account  variations  in 
income,  and  this  caused  large  modifications  in  the  alloca- 
tions. Demographic  estimates  of  error  do  not  correlate  very 
highly  with  simple  synthetic  procedures.  If  they  are  con- 
verted to  revenue-sharing  allocations,  they  will  differ  radically 
from  the  simple  synthetic  estimates.  Revenue  sharing  illus- 
trates the  effects  of  changing  coverage  errors.  The  PES 
presents  serious  problems  with  regard  to  establishing  the 
level  of  coverage  for  individual  States,  but  the  PES  results 
can  be  merged  with  the  demographic  estimates  in  a  weighting 
procedure.  A  moderate  approach  was  suggested,  however. 
Since  some  of  the  more  complex  methods  for  adjustment 
still  need  to  be  worked  out,  the  Bureau  was  cautioned 
against  going  too  far  too  fast.  It  was  argued  that  matching 
a  PES  with  address  registers  gives  uncertain  results. 

It  was  emphasized  also  that  even  simple  adjustment  de- 
pends on  technological  capabilities.  The  models  for  adjust- 
ment are  not  simple,  and  there  are  no  unique  choices,  so 
much  testing  is  needed,  and  there  is  no  time  for  that  for 
1980.  Caution  was  recommended  for  the  immediate  future, 
but  for  the  long  run,  active  research  was  suggested  as  the 
only reasonable course. It may, in fact, not be feasible to
adjust  under  present  conditions. 

The  importance  of  the  census  figures  in  Congress  was 
stressed,  however.  Litigation  is  inevitable  should  nothing 
be  done  in  the  way  of  adjustment.  The  problem  will  not 
"go  away."  It  was  argued  that  attempting  to  ignore  the 
problem  is  a  dangerous  position.  The  only  protection  against 
litigation  is  to  say  that  this  is  the  best  that  can  be  done  at 
this  point.  That  "best"  must  not  be  "arbitrary  and  capri- 
cious," but  the  situation  cannot  wait  for  extended  research 
to  identify  a  generally  recognized  best  method.  It  was  argued 
that  the  impact  of  the  undercount  is  too  important  not  to 
adjust, and that even a simple procedure might leave everyone
unhappy but quiet. There was some concern with equity
in a simple solution, but it was felt that equity decisions should
be  left  to  the  Congress,  which  should  also  specify  the 
methodology  and  write  the  nature  of  the  adjustment  into  its 
formulas.  Congress  might  take  a  simple  approach  with 
technical  advice  from  the  Census  Bureau. 

Although  the  Congress  might  act  in  several  ways,  one 
could  be  to  have  the  Census  Bureau  publish  the  actual  count, 
followed  by  estimates  of  the  underenumeration.  Congress 
would  have  the  option  of  how  these  figures  could  be  used 
in  other  legislation.  This  would  follow  the  Canadian  ex- 
perience, where  it  was  clear  that  the  Parliament  did  not 
know  what  to  do;  the  statistical  agency  is  best  qualified  in 
this  area.  The  Census  Bureau  probably  will  not  be  allowed 
to do nothing, but it cannot simply present its research
results;  it  must  offer  reasonable  minimal  action  for  getting 
closer  to  equity. 

The  Census  Bureau  emphasized  the  distinction  between 
counts  and  estimates  and  stressed  the  need  to  label  them 
accordingly. A simple estimate is justified only if it is also
the best one possible. The complicated nature of the estimation
process also was underscored; matching presents
difficulties,  and  so  does  any  other  method.  The  synthetic 
method  assumes  some  factors  and  ignores  others  (such  as 
the  lack  of  data  on  Hispanics).  Urban/rural  differences  are 
not  small;  they  are  curvilinear.  It  is  only  a  guess  that  the 
blacks  that  were  within-household  misses  may  be  concen- 
trated in  large  cities. 

Attention  also  was  called  to  the  resources  affected  by 
adjustment;  the  $50  billion  in  Federal  funds  allocated  each 
year  are  only  part  of  the  total  benefit.  There  are  also  State 
and  local  distributions,  and  still  more  in  the  private  sector; 
for  example,  a  decision  about  locating  a  plant  affects  job 
opportunities.  There  is  a  need  for  more  accurate  data  for 
private  decisionmaking.  It  was  felt  that  survey  sample  weights 
based  on  the  census  could  affect  minorities  adversely  for 
some  time. 

The American Demographics article "The Statistical
Nightmare"  (Vol.  1,  February  1980,  pp.  18-23)  was  suggested 
as  containing  a  possible  compromise  on  the  issues  of  com- 
plexity in  the  adjustment  and  uses  of  the  adjusted  figures. 
It  was  proposed  in  the  article  that  the  unadjusted  figures  be 
used  for  apportionment,  but  that  the  adjusted  figures  should 
be  relied  upon  in  the  allocation  of  funds.  If  only  adjusted 
figures  were  available,  the  Bureau  would  be  forced  to  play 
an  unintended  role,  but  equity  in  fund  allocation  demands 
some  adjustment  comparable  to  using  estimates  between 
censuses.  Legislative  bodies  could  use  such  estimates,  which 
the  Bureau  could  publish  officially  as  one  component  in  its 
estimates  and  projections  series.  Geographic  estimates 
present  technical  but  not  political  problems,  whereas  choos- 
ing the  best  distribution  of  funds,  for  example,  is  a  political 
matter,  and  the  lawmakers  can  make  their  own  choices. 


The  Congressional  Perspective 


Daniel  P.  Moynihan 

United  States  Senate 


It  is  an  honor  to  address  this  distinguished  gathering  on 
the  urgent  and  important  subject  of  the  "census  under- 
count."  I  would  begin  by  recalling  a  similar  event  held  here 
in  Washington  13  years  ago,  under  the  joint  auspices  of  the 
Bureau  of  the  Census  and  the  Harvard-M.I.T.  Joint  Center 
for  Urban  Studies,  of  which  I  was  then  Director.  That  con- 
ference was,  to  my  knowledge,  the  first  ever  held  on  the 
problem  of  the  "undercount,"  and  I  sketched  its  origins 
and  purposes  in  a  foreword  to  the  conference's  proceedings 
as  published  by  the  Joint  Center  in  1968: 

At  one  point  in  the  course  of  the  1950's  John  Kenneth 
Galbraith  observed  that  it  is  the  statisticians,  as  much  as 
any  single  group,  who  shape  public  policy,  for  the  simple 
reason  that  societies  never  really  become  effectively  con- 
cerned with  social  problems  until  they  learn  to  measure 
them.  An  unassuming  truth,  perhaps,  but  a  mighty  one, 
and  one  that  did  more  than  he  may  know  to  sustain 
morale  in  a  number  of  Washington  bureaucracies  (hateful 
word!)  during  a  period  when  the  relevant  cabinet  officers 
had  on  their  own  reached  very  much  the  same  conclusion— 
and  distrusted  their  charges  all  the  more  in  consequence. 
For  it  is  one  of  the  ironies  of  American  government  that 
individuals  and  groups  that  have  been  most  resistant  to 
liberal  social  change  have  quite  accurately  perceived  that 
social  statistics  are  all  too  readily  transformed  into  poli- 
tical dynamite,  whilst  in  a  curious  way  the  reform  tem- 
perament has  tended  to  view  the  whole  statistical  process 
as  plodding,  overcautious,  and  somehow  a  brake  on 
progress.  (Why  must  every  statistic  be  accompanied  by 
detailed  notes  about  the  size  of  the  "standard  error"?) 

The  answer,  of  course,  is  that  this  is  what  must  be  done 
if  the  fact  is  to  be  accurately  stated,  and  ultimately  ac- 
cepted. But,  given  this  atmosphere  of  suspicion  on  the 
one  hand  and  impatience  on  the  other,  it  is  something  of 
a  wonder  that  the  statistical  officers  of  the  Federal 
Government  have  with  such  fortitude  and  fairness  re- 
mained faithful  to  a  high  intellectual  calling,  and  an  even 
more demanding public trust.

There  is  no  agency  of  which  this  is  more  true  than 
the Bureau of the Census, the first, and still the most
important,  information-gathering  agency  of  the  Federal 
Government.  For  getting  on,  now,  for  two  centuries,  the 
Census  has  collected  and  compiled  the  essential  facts  of 
the American experience. Of late the ten-year cycle has
begun  to  modulate  somewhat,  and  as  more  and  more 
current  reports  have  been  forthcoming,  the  Census  has 
been quietly transforming itself into a continuously flowing
source of information about the American people. In
turn, American society has become more and more
dependent  on  it.  It  would  be  difficult  to  find  an  aspect 
of  public  or  private  life  not  touched  and  somehow  shaped 
by  Census  information.  And  yet  for  all  this,  it  is  somehow 
ignored.  To  declare  that  the  Census  is  without  friends 
would be absurd. But partisans? When Census appropriations
are cut, who bleeds on Capitol Hill or in the Executive
Office of the President? The answer is almost everyone
in general, and therefore no one in particular. But the
result,  too  often,  is  the  neglect,  even  the  abuse,  of  an 
indispensable  public  institution,  which  often  of  late  has 
served  better  than  it  has  been  served. 

The  "avowed  purpose"  of  the  1967  conference  was  that 
"of  arousing  a  measure  of  public  concern  about  the  difficul- 
ties encountered  by  the  census  in  obtaining  a  full  count 
of  the  urban  poor,  especially  perhaps  the  Negro  poor"— we 
would  now  say  black. 

Our  impetus,  in  short,  was  the  "undercount,"  to  use 
today's  word,  that  had  occurred  in  the  1960  decennial 
census.  "It  was  hoped,"  I  wrote,  "that  a  public  airing  of  the 
issue  might  lead  to  greater  public  support  to  ensure  that  the 
census  would  have  the  resources  in  1970  to  do  what  is,  after 
all,  its  fundamental  job,  that  of  counting  all  the  American 
people.  .  .  (T)he  full  enumeration  of  the  American  popula- 
tion is  not  simply  an  optional  public  service  provided  by  the 
Government  for  the  use  of  sales  managers,  sociologists,  and 
regional  planners.  It  is,  rather,  the  constitutionally  mandated 
process  whereby  political  representation  in  the  Congress  is 
distributed  as  between  different  areas  of  the  Nation.  It  is  a 
matter  not  of  convenience  but  of  the  highest  seriousness, 
affecting  the  very  foundations  of  sovereignty.  That  being  the 
case,  there  is  no  lawful  course  but  to  provide  the  Bureau  with 
whatever  resources  necessary  to  obtain  a  full  enumeration." 

Our  focus,  clearly,  was  on  obtaining  a  complete  enumer- 
ation, and  it  is  a  fact  that  the  Census  Bureau  made  a  valiant 
effort  to  do  just  that  in  the  1970  census.  But,  it  is  also  a 
fact  that  it  failed.  This  was  not  an  abject  failure,  however. 
There  is  some  evidence  that  the  1970  undercount  was  less 
severe than 10 years earlier. And there is ample evidence, not
least the convening of today's conference and the commissioning
of the studies and papers that preceded it, that the
Census  Bureau  has  made  a  forthright  and  conscientious 
effort  both  to  estimate  the  size  of  the  1970  undercount  and 
to   develop  a    range   of   possible   remedies   for  the  future. 

The  future  has  now  arrived.  The  1980  census  is  just 
weeks  away.  A  monumental  attempt  has  been  made— is 
being  made— to  obtain  as  complete  an  enumeration  of  the 
population  as  possible.  Yet  we  know  that  there  will  again 
be  an  undercount.  And  we  are  here  today  to  discuss  a  dif- 
ferent issue  than  that  which  absorbed  our  attentions  13 
years  ago.  Given  that  the  problem  known  as  the  "under- 
count" seems  destined  to  persist  in  the  actual  enumeration 
of  the  population,  and  given  our  ability  to  estimate  with 
reasonable  accuracy  how  large  it  is,  should  adjustments  be 
made on the basis of statistical estimates of the undercount?
In  other  words,  if  we  cannot  find  everyone,  but  can  develop 
defensible  estimates  of  the  numbers  of  persons  whom  we 
cannot  find,  should  the  census  figures  be  adjusted  in  light 
of  these  estimates? 

No  doubt  you  are  familiar  with  my  own  view.  Last  fall,  I 
introduced  a  bill  in  the  Senate  to  instruct  the  Secretary  of 
Commerce  to  adjust  the  population  figures  of  the  1980 
census  to  correct  for  the  undercount,  and  to  require  every 
Federal  official  who  administers  a  program  under  which 
money  is  distributed  according  to  population  data  to  use 
these  corrected  figures. 

My  bill  was  in  part  a  response  to  the  conclusions  of  the 
Panel  on  Decennial  Census  Plans,  a  distinguished  group  of 
statisticians  appointed  by  former  Secretary  Kreps,  that 
".  .  .  (T)he  issue  of  equity  cries  out  for  attention.  .  .  .  (T)he 
Census  Bureau  would  be  able  to  respond  in  an  appropriate 
and  competent  fashion  to  a  directive  to  adjust  the  State  and 
local  population  data  for  the  undercount.  .  .  ."  Sensitive  to 
the  Panel's  further  judgment  that  "whether  to  adjust  the 
census  estimates  is  largely  a  policy  issue;  how  to  do  it  is 
primarily  a  technical  one,"  (emphasis  added)  I  sought  to 
provide  a  basis  for  resolving  the  policy  issue  and  producing 
the  appropriate  "directive"  to  the  Bureau.  But  legislation 
is  not  actually  required  for  this  purpose.  The  Secretary  or 
the  President  could  issue  the  necessary  directive.  Indeed, 
the  Bureau  could  furnish  the  indicated  information,  without 
any  directive. 

There  are  three  questions  that  should  be  addressed  in  the 
course  of  this  conference.  Let  me  take  them  up  one  at  a  time. 
First,  what  does  the  Constitution  require?  As  I  read  Article 
I,  Section  2,  and  the  14th  Amendment,  there  are  two  pro- 
visions. The  "whole  number  of  persons  in  each  State,  exclud- 
ing Indians  not  taxed"  shall  be  determined  every  10  years. 
And the House of Representatives "shall be apportioned
among  the  several  States  according  to  their  respective 
numbers." 

It  is  well  established  that  the  responsibility  of  the  census 
is  to  count  the  number  of  persons  to  be  found  in  each  State. 
It  matters  not  whether  they  are  voters  or  nonvoters,  adults 
or  children,  citizens,  lawfully-admitted  foreigners  or  "illegal" 
aliens,  able-bodied  or  handicapped,  English-speaking  or  not, 
self-supporting  workers  or  dependent  persons.  Every  human 
being  physically  present  within  the  borders  of  a  State  on 
April  1— and  some  not  present— is  to  be  counted,  and  the 
House  of  Representatives  is  thereafter  reapportioned  accord- 
ing to  that  count. 

The  Constitution  is  silent  on  the  question  of  the  "under- 
count." One  can  reasonably  assume  that  the  founding  fathers 
never  anticipated  the  issue.  Theirs  was  a  thinly  populated  and 
primarily  agrarian  society  in  which,  to  exaggerate  only 
slightly,  everyone  in  a  community  knew  everyone  else  by 
sight  if  not  by  name.  It  was  deemed  a  relatively  simple 
matter  to  count  them. 

But the founding fathers were wise enough to recognize
that they could not anticipate every contingency, so Article
I,  Section  2  further  provides  that  the  "actual  enumeration 
shall  be  made.  .  .in  such  Manner  as  (the  Congress)  shall  by 
Law  direct." 

The  current  law  provides  as  follows: 

The  Secretary  shall,  in  the  year  1980  and  every  10 
years  thereafter,  take  a  decennial  census  of  population  as 
of  the  first  day  of  April  of  such  year.  .  .in  such  form  and 
content  as  he  may  determine,  including  the  use  of  sam- 
pling procedures  and  special  surveys.  In  connection  with 
any  such  census,  the  Secretary  is  authorized  to  obtain 
such  other  census  information  as  necessary. 

Once  again,  no  mention  is  made  of  the  "undercount" 
issue,  but  the  law  is  certainly  permissive  with  respect  to  the 
procedures  by  which  the  Secretary  shall  enumerate  the  pop- 
ulation. I  am  not  a  lawyer,  but  it  seems  clear  enough  that  the 
phrases  "in  such  form  and  content  as  he  may  determine"  and 
"including  the  use  of  sampling  procedures  and  special 
surveys,"  are  meant  to  provide  the  Census  Bureau  with  suf- 
ficient flexibility  to  produce  as  accurate  and  complete  a 
count  as  possible. 

The  second  question  to  be  addressed  is  the  availability  and 
reliability  of  methods  by  which  completeness  and  accuracy 
can  be  enhanced,  assuming  now— and  I  do  assume— that  it 
will  never  be  possible  to  obtain  a  complete  enumeration 
through  traditional  census  procedures.  In  this  area,  I  defer 
to  the  immense  sophistication  of  those  gathered  here,  adding 
only  my  own  impression,  as  one  who  has  worked  with  census 
data  for  many  years  and  has  a  passing  acquaintance  with 
statistical  methods  and  techniques,  that  many  such  methods 
are  available  and  sufficiently  reliable,  and  that  others  can  be 
developed  with  relative  ease.  Clearly,  precision  and  reliability 
will suffer as one gets into the finer grained tables. It may be
fruitless  to  seek  to  apply  estimating  techniques  and  statistical 
corrections  to  comparisons  of  the  level  of  school  achievement 
in  32-year-old  unmarried  women  residing  in  adjoining  census 
tracts.  It  could  be  done,  but  not  with  sufficient  reliability  to 
make  it  worth  doing.  So  judgments  will  have  to  be  made— 
and  made  by  persons  such  as  yourselves— about  the  specific 
demographic  facts  that  lend  themselves  to  such  techniques. 
At  the  very  least,  it  will  be  necessary  to  make  clear  to  those 
unfamiliar  with  statistical  techniques  which  numbers  and 
which  differences  are  significant  and  to  what  degree.  But 
many  of  the  fundamental  facts  about  our  population— 
notably  how  many  people  live  where— can  be  estimated  with 
a  high  degree  of  reliability.  And  in  my  view  they  should  be. 

The  third  question  is,  what  uses  should  be  made  of  the 
estimated  population  data  as  opposed  to  the  enumerated 
population  data?  I  would  retain  the  distinction.  Indeed,  I 
would  have  the  Census  Bureau  produce  two  sets  of  figures. 
It  is  not  the  Bureau's  responsibility  to  determine  which 
should  be  used  in  which  circumstances.  I  for  one  do  not 
believe  (although  as  a  New  Yorker  this  is  a  statement  against 
interest)  that  the  adjusted  figures  should  be  used  to  re- 
apportion seats  in  the  House  of  Representatives.  One  ought 
not tamper with a fundamental constitutional procedure that
has  served  us  since  1790.  I  am  firmly  of  the  view  that  the 
population  count  that  is  used  for  apportioning  the  Congress 
ought  to  be  one  in  which  every  person  is  able  to  be  associ- 
ated with  a  name  and  an  address,  not  with  a  statistical 
estimate. 

I  have  no  such  reservations  about  applying  the  "second 
set"  of  census  numbers  to  Federal  spending  programs,  how- 
ever. To  the  contrary,  the  decennial  enumeration  has  been 
inadequate  for  such  purposes  for  a  long  time,  and  it  is  often 
necessary  to  make  interim  estimates  of  one  kind  or  another. 
The  point  of  all  such  estimates  is  to  ensure  that  a  Federal 
program  meant  to  alleviate  a  particular  problem  or  condition 
is  administered  in  accord  with  the  most  accurate  and  timely 
information  we  can  obtain  about  the  actual  incidence  of  that 
problem  or  condition.  It  matters  not  whether  the  program 
pays  for  health  care,  social  services,  compensatory  education, 
eases  the  fiscal  burden  on  State  and  local  governments,  cares 
for  the  handicapped,  or  subsidizes  housing  for  the  needy. 
Insofar  as  the  basis— or  a  basis— for  apportioning  Federal 
funds  throughout  the  country  is  demographic  data  supplied 
by  the  Census  Bureau,  that  data  should  be  up-to-date  and 
complete.  Completeness  requires  that  we  adjust  for  the 
undercount,  just  as  we  adjust  for  population  shifts  from  one 
area  to  another.  This  can  be  done  and  in  my  view,  justice 
and  equity  require  that  it  be  done. 

The  alternative  is  the  continuing  politicization  of  the
census  itself,  and  a  steady  decline  in  public  confidence  in  it.
If   members   of  any  group  believe  that  their   numbers  are 
understated  by  the  census,  they  will  inevitably  (and  in  my 
view  understandably)   press  for  ad  hoc  adjustments  to  be 
made  in  the  formulas  by  which   Federal  funds  are  meted 
out;  they  will  seek  special  enumerations  of  their  group;  and 
they   will    increase   the   political    pressures  on  the   Census 
Bureau   to   be   something    less   than  the  respected,   neutral 
agency  that  it  is,  even  as  they  seek  to  discredit  the  data  that 
it  produces.  I  cannot  doubt  that  will  happen,  for  as  a  Senator 
from    New    York,    I   would    be   obliged  to  respond   in  not 
dissimilar  fashion  if   I   had   reason  to  believe  that  my  own 
constituency  was  being  disadvantaged  by  the  census  under- 
count. It  is  simply  the  fact  that  billions  of  dollars  are  now 
redistributed  among  various  parts  of  the  Nation  on  the  basis 
of  census  data.  If  we  fail  to  establish  the  proposition  that 
those  data  are  as  accurate  and  complete  as  the  techniques 
of  enumeration  and  estimation  can  make  them,  the  ensuing 
politicization  will  damage  the  census  even  as  it  jeopardizes  the 
objectivity  on  which  these  myriad  formulas  tenuously  rest. 
We  will  seek  to  minimize  the  undercount,  but  we  will 
never    eliminate   it.  The  challenge  to  this  conference,  and 
to  the  Secretary  of  Commerce  and  the  Bureau  of  the  Census 
in  its  aftermath,  is  to  do  the  next  best  thing:  To  make  every 
effort  that  can  be  made  within  the  bounds  of  sound  statis- 
tical methodology  to  estimate  the  undercount  and  to  publish 
the  results  of  those  estimates. 


Methodological 
Considerations  I 


Can  Regression  Be  Used  To  Estimate 
Local  Undercount  Adjustments? 


Eugene  P.  Ericksen 

Temple  University 


ISSUES  AND  OBJECTIVES 

Census  undercount  has  become  a  public  issue  in  1980. 
The  political  and  fiscal  implications  of  the  possibility  that 
levels  of  undercount  vary  between  places  have  increased  the
pressure  on  the  Census  Bureau  to  compute  local  estimates 
of  the  number  of  people  missed  in  the  decennial  census. 
When  considering  whether  and  how  to  do  this,  the  Bureau 
needs  to  find  a  method  of  establishing  whether  the  under- 
count does  vary  between  places,  whether  the  variations  are 
systematic,  and  whether  the  nature  of  these  variations  can  be 
estimated.  Doing  this  successfully  appears  to  require  match- 
ing administrative  records  against  census  forms  for  large 
probability  samples  of  States  and  large  metropolitan  areas. 
The  administrative  records  must  enumerate  the  uncounted 
population,  the  matching  procedure  must  be  feasible,  and  the 
sampling  design  must  be  sufficiently  precise  to  provide  re- 
liable estimates  for  a  variable  whose  actual  values  between 
places  will  not  vary  greatly.  Descriptions  of  the  Census 
Bureau's  plans  are  given  in  [1].  As  they  describe  them, 
the  procedures  appear  to  be  neither  easy  nor  cheap,  the 
main  problem  being  one  of  matching  people  at  the  same 
address.  Computing  these  sample  estimates  of  local 
undercount  (a  most  difficult  procedure)  is  the  critical 
link  in  the  computation  of  undercount  estimates.  Once 
the  sample  estimates  are  at  hand,  we  have  a  variety  of 
techniques  using  symptomatic  indicators  to  compute  final 
estimates.  Leaving  the  difficult  problem  of  actually  com- 
puting the  sample  estimates  to  the  Bureau,  my  objective 
here  is  to  discuss  ways  to  combine  the  estimates  with  aux- 
iliary information  to  derive  the  final  estimates  of  local  under- 
count. 

A  review  of  recent  papers  by  Census  Bureau  personnel  [1,
15,  18]  indicates  a  recognition  of  the  statistical  and  political
problems  associated  with  undercount  estimates.  The  under- 
count adjustment  procedure  needs  to  be  statistically  sound 
and  politically  credible.  If  the  Census  Bureau  is  going  to 
part  from  its  tradition  of  publishing  figures  based  only  on 
people  actually  counted,  it  must  have  firm  and  noncontro- 
versial  ground  to  stand  on.  The  undercount  adjustment 
procedure  should  be  correct  and  easily  understood,  and  be 
capable  of  defense  by  knowledgeable  statisticians.  The  need 
to  withstand  legal  challenges  calls  for  a  sound  but  conserva- 
tive procedure  rather  than  a  potentially  more  perfectible  one. 

There  are  many  ways  to  classify  undercount  adjustment 
procedures,  and  I  would  like  to  discuss  two.  One  classifica- 
tion sorts  procedures  into  "individualistic"  and  "ecological" 
categories.  Most  individualistic  procedures  would  result  in  a
synthetic  estimate  [10,  13;  but  see  12  and  14  for  useful
evaluations  and  comparisons  with  other  methods].  One  first 
estimates  individual  likelihoods  of  being  missed.  This  would 
be  done  for  aggregates  of  individuals  classified  by  demo- 
graphic variables  like  age,  race,  and  sex;  a  good  example  of 
this  is  the  national  undercount  estimates  given  in  [16]. 
Multiple  regression,  as  illustrated  in  [3],  could  be  used  to
incorporate  more  variables  into  the  estimate,  though  log- 
linear  models  [2]  may  be  a  more  appropriate  technique. 
Once  the  rates  are  computed  for  subgroups  of  individuals, 
these  would  be  assumed  to  be  constant  across  localities,  and 
the  subgroup-specific  rates  would  be  applied  to  the  demo- 
graphic structure  of  each  areal  unit  to  provide  an  overall 
estimate  of  undercount  for  that  place.  The  approach  is 
intuitively  plausible,  and  will  work  unless  there  are  peculiarly 
local  effects  that  cannot  be  incorporated  into  the  estimate. 
Two examples of such local effects are: (1) age-race-sex
estimates such as those computed by Siegel are applied to the
age-race-sex distributions of all localities, but both blacks and
whites are in fact less likely to be counted in central cities than
in suburbs; and (2) the efficiency with which local census offices
collect forms in "hard-to-count" areas varies from place to place.
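
The synthetic procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the subgroup ratios of adjusted to enumerated population, the group labels, and the counts for the two areal units are hypothetical values, not figures taken from [16].

```python
# Hypothetical national ratios of adjusted to enumerated population,
# assumed (as the synthetic method requires) to be constant across localities.
NATIONAL_RATIOS = {"black": 1.08, "nonblack": 1.02}


def synthetic_estimate(enumerated_by_group, ratios=NATIONAL_RATIOS):
    """Apply subgroup ratios to one areal unit's enumerated composition.

    Returns (adjusted_total, overall_ratio) for that unit.
    """
    enumerated_total = sum(enumerated_by_group.values())
    adjusted_total = sum(
        count * ratios[group] for group, count in enumerated_by_group.items()
    )
    return adjusted_total, adjusted_total / enumerated_total


# Two hypothetical areal units with different demographic structures.
central_city = {"black": 40_000, "nonblack": 60_000}
suburb = {"black": 2_000, "nonblack": 98_000}

for name, counts in (("central city", central_city), ("suburb", suburb)):
    adjusted, ratio = synthetic_estimate(counts)
    print(f"{name}: adjusted total = {adjusted:,.0f}, ratio = {ratio:.4f}")
```

Because the same subgroup ratios are applied everywhere, any difference between the two units comes solely from their demographic composition; a purely local effect, such as a central-city penalty shared by both groups, cannot appear in the result.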

By  contrast,  with  the  "ecological  approach"  we  don't 
attempt  to  compute  rates  for  individuals,  but  simply  com- 
pute estimated  rates  for  localities  and  look  for  systematic 
variations  in  these  aggregated  rates  using  a  procedure  like 
linear  regression.  We  might  find  that  estimated  rates  for  local- 
ities increase  where  there  are  large  black  populations,  the 
locality  is  a  central  city,  or  there  is  a  large  proportion  of 
multihousing-unit  structures.  Such  an  estimate  would  not 
imply  that  blacks  were  harder  to  count,  simply  that  locali- 
ties where  blacks  lived  had  a  greater  undercount.  The  missed 
population  could,  for  example,  be  Puerto  Ricans. 

Another  way  to  classify  methods  is  into  "simple"  and 
"complex"  categories.  Both  synthetic  and  regression  esti- 
mates are  complex,  as  their  credibility  rests  on  lists  of 
assumptions  concerning  distributions  of  variables  and  rela- 
tionships among  variables.  The  set  of  sample  estimates  for 
localities  is  an  obvious  candidate  for  a  simple  estimate.  Let 
us  consider  their  use,  specifying  the  variable  in  question  to 
be  the  ratio  of  total  to  enumerated  population  and  assum- 
ing that  the  estimates  have  values  perceptibly  greater  than 
1.000,  that  the  coefficient  of  variation  is  0.005  or  less,  and 
that  nonsampling  errors  are  minimal.  In  other  words,  the 
procedures  described  in  [1]  have  been  successfully  applied. 

These  estimates  would  work  for  States.  This  is  important 
because   States   are   the   first   level  of  funding  for   Federal 
revenue-sharing  programs,  and  grants  to  States  comprise  a
large  proportion  of  such  spending.  The  estimates  would  fare 
less  well  below  the  State  level.  Even  if  good  sample  estimates 
were  computed  for  large  metropolitan  areas,  since  these  are 
not  governmental  units,  we  would  have  only  one  estimate 
to  be  applied  to  all  jurisdictions  within  the  standard  metro- 
politan statistical  area  (SMSA).  Since  we  expect  to  find  rates 
of  undercount  higher  in  largely  black  central  cities  than  pre- 
dominantly white  suburbs,  a  single  adjustment  for  both  is 
implausible  and  will  do  little  to  alleviate  the  political  pressure 
calling  for  undercount  adjustments. 

Secondly,  the  sample  estimates  are  not  easy  to  compute. 
If  the  sample  estimates  are  derived  through  a  particularly 
laborious  procedure,  they  may  no  longer  qualify  as  "simple" 
estimates,  and  estimating  their  error  structure  could  be  prob- 
lematic. Records  from  the  Internal  Revenue  Service  are  to  be 
used  in  the  administrative  record  matching  and  we  know  that 
tax  returns  miss  people,  especially  the  poor  and/or  mobile 
who  are  not  likely  to  be  counted  in  a  census.  The  record- 
keeping systems  necessary  for  triple  system  estimation  are 
horrendous  to  maintain,  and  matching  problems  add  a  non- 
sampling  error  component  to  the  mean  squared  error  of  the 
sample  estimates.  Thirdly,  even  if  the  mean  squared  error  is 
random  and  small,  some  larger  errors  will  occur  by  chance 
alone.  If  5  percent  of  the  sample  estimates  are  two  standard 
deviations  away  from  true  values,  and  this  is  an  unacceptable 
level  of  error,  not  knowing  which  observations  are  the  5 
percent  weakens  the  credibility  of  all  estimates. 

Identifying the outliers requires an auxiliary estimate
which is independent of the sample estimate [7], and the
synthetic and regression estimates we have discussed are two
prime candidates. A good review of available auxiliary estimates
is given in [14]. The auxiliary estimates are needed at
least  as  a  check  on  the  sample  estimates.  If  the  errors  of  the 
auxiliary  estimates  are  equal  to  or  smaller  than  those  of  the 
sample  estimates,  the  auxiliary  estimates  would  be  preferred, 
since  they  can  be  applied  to  a  wider  variety  of  places. 

The  auxiliary  estimates  have  several  advantages.  Local  sampling  fluctuations
are  absent.  We  can  use  the  sample  estimates  to  test  the  good- 
ness of  fit  of  alternative  models,  and  the  model  selection 
process  will  help  us  to  learn  about  the  conditions  where 
census  undercount  is  more  or  less  likely  to  occur.  The  ex- 
planation provided  by  the  selected  model  increases  the 
credibility  of  the  model.  We  can  use  the  sample  estimates  to 
estimate  the  error  of  the  auxiliary  estimates  and  make  an 
empirical  case  for  the  selected  model  based  on  minimizing 
average  squared  error. 

This  lessens  the  main  disadvantage  of  synthetic  and  re- 
gression estimates,  which  is  that  they  are  biased.  Regardless 
of  sample  size,  the  expected  value  of  a  synthetic  or  regression 
estimate  for  a  locality  is  unlikely  to  equal  the  expected  value 
of  an  unbiased  sample  estimate.  The  extent  that  local  condi- 
tions are  atypical  is  important,  and  though  testing  for  the 
goodness  of  fit  can  indicate  that  the  bias  is  small  in  general, 
it can be large in a particular case. This problem of using
regression for local estimation has been illustrated for 1970
population estimates for counties [6]. In general, errors were
low, but some very large errors were obtained for counties
with unusual age distributions.

MODELS  OF  UNDERCOUNT  AND  THE  UNIT 
OF  ANALYSIS 

Since  we  don't  expect  to  rely  on  sample  estimates  alone, 
we  need  to  use  the  auxiliary  information  to  select  a  model  of 
undercount.  We  need  a  parsimonious  model  to  sell  the  esti- 
mates, which  means  that  we  don't  choose  a  complex  model 
unless  there  is  convincing  evidence  that  a  simpler  one  should 
be  ruled  out.  Here  are  three  models,  which  can  be  arranged 
in  order  of  increasing  complexity,  that  can  be  considered: 
(1)  All  areas  assumed  to  have  equal  levels  of  undercount;  (2) 
the  undercount  of  an  area  can  be  accounted  for  by  its  demo- 
graphic structure  and  by  national  undercount  rates  estimated 
separately  for  demographic  categories  like  age,  race,  and  sex; 
and  (3)  there  are  local  variations  possibly  to  be  accounted  for 
by  regression.  If  the  sample  data  fit  model  (1),  we  select  it. 
Failing  this,  we  give  preference  to  model  (2)  over  model  (3). 

Let us introduce notation to describe the models. Each
person is in a demographic subgroup $i$ and lives in areal unit
$j$ in block $k$. We have a simple random sample of $n$ blocks
in each of the areal units and each block is enumerated
completely. For each $i, j, k$ we compute

\[ r_{ijk}^{*} = y_{ijk}^{*} / x_{ijk} \]

where

$x_{ijk}$ = the number of people in subgroup $i$ in block $k$ of areal unit $j$ who are counted in the census, and

$y_{ijk}^{*}$ = the adjusted count after matching procedures have been completed.


We also write $x_{.j}$ = the total census count in areal unit $j$,
and we would like to know the comparable value of $y_{.j}$.
We write $x_{..}$, the census count for the Nation and $y_{..}$, the
adjusted count for the Nation. We assume that $y_{..}$ equals the
adjusted count that would be derived using procedures given
in [16]. Under model (1), we would estimate the adjusted
count for each block to be

\[ y_{jk}' = r_{..} \sum_{i} x_{ijk} \]

where $r_{..} = y_{..} / x_{..}$. The adjusted count for areal unit $j$
would be:

\[ y_{.j}' = (x_{.j})(r_{..}) \]

Since  we  have  a  simple  random  sample  of  blocks  in  each  areal 
unit,  we  can  perform  an  analysis  of  variance  with  the  areal 
units specified as treatments and the blocks as observations.
We compute:

\[ r_{jk}^{*} = \sum_{i} y_{ijk}^{*} \Big/ \sum_{i} x_{ijk} \]

and perform the analysis of variance on the $r_{jk}^{*}$.
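
As a rough sketch of the test of model (1), the following Python fragment forms the block ratios and computes the one-way analysis of variance F statistic by hand, with areal units as treatments and sampled blocks as observations. The block counts are hypothetical, and the function names are introduced here only for illustration; the resulting F would still have to be compared with a tabulated critical value for the stated degrees of freedom.

```python
def block_ratios(blocks):
    """blocks: list of (enumerated, adjusted) counts for one unit's sampled blocks."""
    return [adjusted / enumerated for enumerated, adjusted in blocks]


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the unit means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the blocks around their own unit mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical samples of three blocks in each of two areal units.
sample = {
    "unit A": [(400, 408), (350, 359), (500, 509)],
    "unit B": [(420, 445), (380, 399), (450, 476)],
}
ratios = [block_ratios(blocks) for blocks in sample.values()]
f_stat, df_b, df_w = one_way_anova_f(ratios)
print(f"F = {f_stat:.2f} on ({df_b}, {df_w}) degrees of freedom")
```

A large F relative to the critical value would indicate that the ratios differ between areal units by more than block-to-block variation can explain, which is the evidence needed to set model (1) aside.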

Legal  and  financial  decisions  based  on  decennial  census 
counts  of  local  populations  have  been  made  as  if  model  (1) 
were  correct.  A  clear  establishment  of  the  fact  of  differential 
undercount  is  a  necessary  justification  for  the  computation 
and  use  of  local  area  estimates.  Since  we  know  that  the 
undercount  varies  across  demographic  groups  and  that  demo- 
graphic structures  of  places  also  vary,  we  expect  to  reject 
model  (1).  To  be  sure  that  our  analysis  of  variance  test 
fairly  rejects  the  null  hypothesis  associated  with  this  model, 
we  should  replicate  it  for  small  subgroups  of  States  and 
metropolitan  areas  grouped  by  region  or  some  other  char- 
acteristic. Undercount  adjustments  should  not  be  computed 
if  place-to-place  variations  are  small. 

This  done,  we  next  test  model  (2).  A  claim  for  the  correct- 
ness of  model  (2)  would  state,  for  example,  that  blacks  are 
consistently  harder  to  count  than  whites  in  the  same  juris- 
diction. The  counterargument  would  claim  that  racial  differ- 
entials result  from  the  nature  of  where  blacks  and  whites  live, 
with  blacks  being  concentrated  in  areas  that  are  hard  to 
enumerate.  In  such  areas,  they  may  be  no  harder  to  count 
than nonblacks. Model (2) can be tested using analysis of
covariance.

We write the following

\[ y_{jk}^{*} = \sum_{i} y_{ijk}^{*} \]

and

\[ y_{jk}'' = \sum_{i} y_{ijk}'' \]

where

\[ y_{ijk}'' = r_{i..}\, x_{ijk} \quad \text{and} \quad r_{i..} = \sum_{j} \sum_{k} y_{ijk}^{*} \Big/ \sum_{j} \sum_{k} x_{ijk} \]

We can use an analysis of covariance model [17, pp. 420-425]
to express the $y_{jk}^{*}$ as follows

\[ y_{jk}^{*} = \mu_{j} + b\,(y_{jk}'' - \bar{y}_{..}'') + e_{jk} \]

where,

$\mu_{j}$ = the effect of location in areal unit $j$ once the effect of the synthetic estimate $y_{jk}''$ has been removed,

$b$ = the regression coefficient expressing the effect of $y_{jk}''$ on the values of $y_{jk}^{*}$ once the locality effects have been removed, and

$e_{jk}$ = a random error term.


We compute a pooled estimate of $b$ over the samples of
blocks within each areal unit and use as an estimate of the
effect of location in areal unit $j$:

\[ \hat{\mu}_{j} = \bar{y}_{j.}^{*} - b\,(\bar{y}_{j.}'' - \bar{y}_{..}'') \]

where the bars denote means over the sampled blocks.

An F test of the equality of the $\mu_{j}$'s can then be computed.
If the test fails to reject equality, we conclude that a model whereby there is
no systematic set of local "effects" is not inconsistent with
the data. In other words, once we know what the synthetic
estimates for the sample of blocks are, we gain no additional
information from knowing the identity of the areal unit $j$.
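
The pooled slope and the adjusted location effects can be sketched as follows. This is a minimal illustration, assuming equal numbers of sampled blocks per areal unit as in the text; the data are hypothetical, the function name is invented for the example, and the F test itself is not carried out here.

```python
def adjusted_location_effects(units):
    """units: dict mapping unit name -> list of (synthetic, adjusted) block counts.

    Returns (b, effects): the pooled within-unit slope of adjusted on synthetic
    counts, and each unit's mean adjusted count corrected to the grand mean of
    the synthetic counts.
    """
    all_x = [x for pairs in units.values() for x, _ in pairs]
    grand_x = sum(all_x) / len(all_x)

    sxy = sxx = 0.0
    unit_means = {}
    for name, pairs in units.items():
        mean_x = sum(x for x, _ in pairs) / len(pairs)
        mean_y = sum(y for _, y in pairs) / len(pairs)
        unit_means[name] = (mean_x, mean_y)
        # Sums of products are taken around each unit's own means (pooling).
        sxy += sum((x - mean_x) * (y - mean_y) for x, y in pairs)
        sxx += sum((x - mean_x) ** 2 for x, _ in pairs)

    b = sxy / sxx
    effects = {
        name: mean_y - b * (mean_x - grand_x)
        for name, (mean_x, mean_y) in unit_means.items()
    }
    return b, effects


# Hypothetical blocks: (synthetic estimate, adjusted count after matching).
units = {
    "unit A": [(100, 104), (200, 208)],
    "unit B": [(100, 110), (200, 214)],
}
b, effects = adjusted_location_effects(units)
print(f"pooled b = {b:.3f}", effects)
```

In this constructed example the two units share the same slope but differ in level, so the adjusted effects differ; that residual difference between units is what the F test of equality is meant to detect.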

If  we  reject  both  the  tests,  our  task  becomes  that  of  iden- 
tifying other  relevant  factors  affecting  the  undercount.  The 
regression-sample  data  procedure  [4,  5]  provides  a  way 
of  doing  this,  and  it  can  be  used  on  its  own  or  along  with 
synthetic  estimation.  Any  synthetic  estimate  can  be 
tested  by  analyzing  its  correlation  with  the  sample  esti- 
mates or  by  including  it  in  a  regression  equation  along  with 
other  predictors  observed  to  be  related  to  the  sample  esti- 
mates of  undercount. 

The unit of sampling becomes an issue for regression estimates.
Current census plans appear to call for the sampling
of large metropolitan areas and those States and remainder
of States outside these SMSA's. Because these areas, agglomerations
of many local jurisdictions, tend to be heterogeneous
with respect to race, the variance of $r_{.j}$, the ratio of
adjusted to counted population, is likely to be minimal. To
illustrate, suppose that 8 percent of blacks and 2 percent of
nonblacks are missed in the census count. If we apply these
rates to the 1970 racial distributions of States and assume
that no other factors influence the undercount, we derive
values of $r_{.j}$ ranging from 1.020 (9 States each of whose population
is less than 1 percent black) to highs of 1.042 in
Mississippi and 1.038 in Louisiana and South Carolina. The
mean of this distribution is 1.025 and its variance is $30.52 \times 10^{-6}$.
A similar calculation for the 33 SMSA's with over 1
million population in 1970 gives a range of 1.021 (four
SMSA's) to 1.039 (New Orleans) and 1.037 (Baltimore),
a mean of 1.027 and a variance of $22.24 \times 10^{-6}$.
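
The arithmetic behind this illustration can be sketched as below. The linear form $r = 1 + 0.08p + 0.02(1 - p)$, with $p$ the black share of the counted population, is an assumption about how the rates were applied; it gives 1.020 for a unit with no black population, in line with the lower end of the quoted range. The list of shares is hypothetical and serves only to show how a mean and variance of the ratios would be obtained.

```python
def ratio_from_black_share(p, miss_black=0.08, miss_nonblack=0.02):
    """Ratio of adjusted to counted population for a unit with black share p.

    Assumes the miss rates are applied additively to the counted population.
    """
    return 1.0 + miss_black * p + miss_nonblack * (1.0 - p)


def mean_and_variance(values):
    """Population mean and variance of a list of ratios."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


# Hypothetical black shares for a handful of areal units.
shares = [0.00, 0.05, 0.12, 0.25, 0.37]
ratios = [ratio_from_black_share(p) for p in shares]
mean, variance = mean_and_variance(ratios)
print([round(r, 3) for r in ratios])
print(f"mean = {mean:.4f}, variance = {variance * 1e6:.2f} x 10^-6")
```

Under this form the ratio can only range between 1.02 and 1.08, so units that mix central cities and suburbs are bound to cluster near the low end, which is why the variance across large agglomerations is so small.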

An  alternative  plan  would  subdivide  these  agglomerations
into  more  homogeneous  units.  For  example,  instead  of 
having  the  33  equivalent  samples  of  metropolitan  areas,  we 
would  have  33  samples  of  central  cities,  33  samples  of  their 
suburbs,  and  a  total  of  66  units  of  observation,  each  with 
comparable,  though  smaller,  sample  sizes.  Cutting  the  sample 
size  in  half  will  double  the  sampling  variance,  but  this  could 
be  more  than  compensated  for  by  the  increased  explanatory 
power of the predictor variables. We repeated our calculations
based on the 8- and 2-percent noncoverage rates for the
66 central city/suburban units and obtained a range of 1.020
(6 suburbs) to 1.063 (Washington, D.C.) and 1.052 (Newark),
a mean of 1.029 and a variance of $89.06 \times 10^{-6}$, four times
that obtained for the 33 SMSA's. If it were feasible to compute,
a regression equation based on the 66 units would be
applicable  to  a  greater  range  of  places  than  would  a  regres- 
sion equation  based  on  the  33  units.  This  is  because  we  could 
evaluate  a  greater  range  of  values  of  racial  composition  and 
could  explicitly  estimate  the  effects  of  central  city  location. 
Let  us  now  discuss  the  methodology  of  regression  in  more 
detail. 

THE  USE  OF  REGRESSION  TO  COMPUTE 
LOCAL  ESTIMATES  OF  CENSUS  UNDERCOUNT 

The  methodology  of  the  regression  procedure  consists 
of  first  obtaining  a  set  of  sample  estimates  for  the  areal  units 
under  consideration  and  then  estimating  their  variances.  We 
next  obtain  a  set  of  variables  thought  to  be  related  to  the 
dependent  variable  under  consideration,  and  compute  a  re- 
gression equation  using  these  auxiliary  variables  as  predictors. 
The  auxiliary  variables  can  be,  and  often  are,  alternative  esti- 
mates of  the  dependent  variable  in  question,  and  correlations 
between  the  predictors  and  true,  but  unobserved,  values  of 
the  dependent  variable  have  often  been  between  0.90  and 
1.00  when  evaluations  on  test  data  have  been  made.  Be- 
cause of  the  within  primary  sampling  unit  (PSU)  error,  the 
observed  correlations  are  lower,  but  if  the  usual  assumptions 
of  regression  analysis  are  met,  the  regression  coefficients 
obtained  are  unbiased  estimates  of  the  coefficients  that 
would be obtained if the true values of the dependent variable
were available. The mean squared error of the regression
estimate can be written

\[ \mathrm{MSE} = (n - p - 1)\,\sigma_u^2 / n + (p + 1)\,\sigma_w^2 / n \]

where

$\sigma_u^2$ = the variance of the true, but unobserved values not explained by regression,

$\sigma_w^2$ = the sampling variance of the sample estimates,

$n$ = the number of observations, and

$p$ = the number of predictor variables.


The $\sigma_u^2$ can be written as $\sigma_u^2 = \sigma_y^2 (1 - R^2)$, where $\sigma_y^2$
is the original variance of the true values and $R^2$ is the coefficient
of determination that would be obtained were the
true values available. Our expression for the mean squared
error is not to be confused with the mean squared difference
between regression and sample estimates, which is:

\[ \sum (\hat{Y} - Y)^2 / n = [(n - p - 1)/n]\,(\sigma_u^2 + \sigma_w^2) \]

It should be noted that:

\[ \mathrm{MSE} = \Big[ \sum (\hat{Y} - Y)^2 / n \Big] - (n - 2p - 2)\,\sigma_w^2 / n \]

We can readily see that the mean squared error of the regression
estimates is less than the mean squared error of the
sample estimates if $\sigma_u^2 < \sigma_w^2$. For estimating the census
undercount, if the variances of sample estimates, $\sigma_w^2$, are
unexpectedly high, regression becomes a more attractive
alternative, since the underlying relationships between variables
are not affected. If the correlations between synthetic
estimates used as predictor variables are as high as we
would reasonably expect them to be, we would almost
certainly pick regression over sample estimates.
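
The two expressions above are easy to confuse, so a small numerical sketch may help. The function names are introduced here for illustration, and all variances are expressed in units of $10^{-6}$; the inputs are the first-row values of table 1 ($n = 33$, $p = 1$, $\sigma_u^2 = 2.5$, $\sigma_w^2 = 25$).

```python
def regression_mse(n, p, var_u, var_w):
    """Mean squared error of the regression estimate."""
    return ((n - p - 1) * var_u + (p + 1) * var_w) / n


def mean_squared_difference(n, p, var_u, var_w):
    """Expected mean squared difference between regression and sample estimates."""
    return (n - p - 1) * (var_u + var_w) / n


n, p, var_u, var_w = 33, 1, 2.5, 25.0
mse = regression_mse(n, p, var_u, var_w)
msd = mean_squared_difference(n, p, var_u, var_w)

# The identity linking the two quantities.
recovered = msd - (n - 2 * p - 2) * var_w / n

print(f"MSE of regression estimate      = {mse:.2f}")
print(f"mean squared difference         = {msd:.2f}")
print(f"MSE recovered from the identity = {recovered:.2f}")
print("regression beats sample estimate:", mse < var_w)
```

The mean squared difference is much larger than the mean squared error because it is dominated by the sampling error of the sample estimates themselves; the comparison that matters for choosing a method is the one between the regression MSE and $\sigma_w^2$.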

Our  equation  for  the  mean  squared  error  can  be  used  to 
evaluate  choices  to  be  made  in  computing  regression  esti- 
mates. Among  these  choices  are: 

(a)  When  is  it  advisable  to  add  additional  predictor  vari- 
ables to  the  regression  equation? 

(b)  Is  it  better  to  use  fewer  observations  with  smaller 
sampling  variances  or  more  observations  with  larger 
sampling  variances? 

To  illustrate  how  these  choices  might  be  made,  we  present 
(table  1)  examples  of  mean  squared  error  computations. 
These  have  been  computed  for  a  variety  of  circumstances. 
We  assume  a  good  synthetic  estimate  to  be  available  as  the 
first  predictor  variable.  Sample  correlations  of  this  with  the 
actual  (but  unobserved)  values  of  the  ratios  of  the  true  to 
counted  populations  are  given  in  the  first  column,  followed 
by  estimates  of  the  correlations  that  would  actually  be 
observed  for  three  different  values  of  the  sampling  variances. 
The  lowest,  $25 \times 10^{-6}$,  is  approximately  what  I  understand
to  be  the  goal  of  the  Census  Bureau's  sample  design.  Esti- 
mates of  the  mean  squared  error  and  its  components  under 
these  conditions  are  then  given.  This  is  followed  by  esti- 
mates of  increases  in  the  coefficient  of  determination  with 
the  true  (but  unobserved)  values,  which  would  have  to  be 
obtained  to  make  it  useful  to  add  a  second  predictor  variable. 
The exercise is then repeated with the number of observations
and sampling variances each doubled. The variances,
$\sigma_y^2$, of the underlying values of $r_{.j}$ are multiplied by four
in  an  attempt  to  replicate  what  would  happen  if  we  used 
66  central  city/suburban  units  instead  of  33  SMSA's.  The 
results  are  given  in  table  1  and  can  be  summarized  as  follows: 

1.  The  observed  correlations  are  dampened  substantially 
by  the  sampling  variances.  They  appear  in  fact  to  be 
affected  more  by  the  size  of  the  sampling  variances 
than  by  the  strength  of  the  underlying  relationships. 
But,  observing  a  low  correlation  doesn't  mean  that 
regression  has  failed.  We  need  to  look  at  the  mean 
squared  error  to  make  that  judgment. 

2. For given values of $\sigma_u^2$, increases in $\sigma_w^2$ led to a larger
mean squared error of the regression estimates. The
gains of regression relative to sample estimates are
greater when the sampling variances are large. In only
one case, where $\sigma_y^2 = 100 \times 10^{-6}$, $r^2 = .500$, and
$\sigma_w^2 = 50 \times 10^{-6}$, is the mean squared error of the
regression estimate as great as $\sigma_w^2$ and doesn't exceed
it.

Table 1. Mean Squared Errors of Regression Estimates Given Various Combinations of Variances and Sample Sizes

$\sigma_y^2 = 25 \times 10^{-6}$, $n = 33$, $p = 1$

Actual   $\sigma_u^2$  Observed $r^2$          $(p+1)(\sigma_w^2 - \sigma_u^2)/n$  Mean squared error      If $p$ increased to 2, $R^2$ now needed
$r^2$    (x $10^6$)                            (x $10^6$)                          (x $10^6$)              to obtain same mean squared error (see footnote 1)

Within-PSU variance x $10^6$:
                       25      100     400     25          100         400         25      100     400     25      100     400

.90      2.5           .48     .23     .11     1.4         5.9         24.1        3.9     8.4     26.6    .93     NP      NP
.80      5.0           .44     .21     .10     1.2         5.8         23.9        6.2     10.8    28.9    .83     .92     NP
.70      7.5           .39     .19     .10     1.1         5.6         23.8        8.6     13.1    31.3    .72     .82     NP
.50      12.5          .30     .15     .09     0.8         5.3         23.5        13.3    17.8    36.0    .52     .61     .98

$\sigma_y^2 = 100 \times 10^{-6}$, $n = 66$, $p = 1$

Within-PSU variance x $10^6$:
                       50      200     800     50          200         800         50      200     800     50      200     800

.975     2.5           .66     .35     .14     1.4         6.0         24.2        3.9     8.5     26.7    .982    NP      NP
.950     5.0           .64     .34     .13     1.4         5.9         24.1        6.4     10.9    29.1    .957    .980    NP
.925     7.5           .62     .33     .13     1.3         5.8         24.0        8.8     13.3    31.5    .932    .955    NP
.900     10.0          .61     .32     .13     1.2         5.8         23.9        11.2    15.8    33.9    .906    .929    NP
.800     20.0          .55     .29     .12     0.9         5.5         23.6        20.9    25.5    43.6    .805    .828    .920
.700     30.0          .48     .26     .11     0.6         5.2         23.3        30.6    35.2    53.3    .703    .726    .818
.500     50.0          .35     .19     .08     0.0         4.5         22.7        50.0    54.5    72.7    .500    .523    .615

NOTE: The mean squared error of the regression estimates can be written as: $\mathrm{MSE} = \sigma_u^2 + (p + 1)(\sigma_w^2 - \sigma_u^2)/n$.

1 This refers to the same mean squared error obtained for p = 1 when a second predictor variable is added to the regression equation.
Where "NP" is indicated, the correlation would have to be greater than 1.0, so it is impossible to obtain the same mean squared
error.

3. Reductions in $\sigma_u^2$ are offset by increases in the $\sigma_w^2$ component when
predictor variables are added to regression. Looking at
the last three columns of the table, we find that the underlying
values of $R^2$ needed to reduce the mean squared
error are bigger when the sampling variances are larger.
In  some  cases,  when  the  first  predictor  variable  is 
strongly  related  to  the  dependent  variable,  there  is  no 
gain  from  adding  predictors.  Should  this  occur,  we  may 
want  to  consider  simply  using  the  synthetic  estimate,  if 
it  is  in  fact  the  first  predictor,  since  it  will  not  include 
the  sampling  error  component  of  the  mean  squared 
error  of  the  regression  estimate. 

4. The choice between the 33 and 66 areas depends on
the relative mean squared errors. We can see that it is
easily possible for reductions in $\sigma_u^2$ to offset the increased
$\sigma_w^2$ when we have more observations, each
with increased sampling variance. The fact that a regression
equation based on a larger number of more
homogeneous units can be applied more flexibly will
induce us to select it if the mean squared errors are
comparable to those obtained with fewer but more
heterogeneous units. The greater sampling variances
need not deter us from this choice.
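
Points 3 and 4 can be explored with a short sketch built on the mean squared error expression given earlier. The function names are hypothetical and variances are in units of $10^{-6}$. Solving for the $R^2$ needed after adding a predictor is one reading of how the last columns of table 1 were obtained; it gives about .93 for the first row at the smallest sampling variance, though it may not match every tabulated entry to the last digit.

```python
def regression_mse(n, p, var_u, var_w):
    """Mean squared error of the regression estimate."""
    return ((n - p - 1) * var_u + (p + 1) * var_w) / n


def r2_needed_with_extra_predictor(n, p, var_y, r2, var_w):
    """R^2 that p + 1 predictors must reach to match the MSE obtained with p.

    Returns None when the required value exceeds 1.0 (the "NP" case).
    """
    target = regression_mse(n, p, var_y * (1.0 - r2), var_w)
    # Solve ((n - p - 2) * var_u_new + (p + 2) * var_w) / n = target.
    var_u_new = (n * target - (p + 2) * var_w) / (n - p - 2)
    r2_new = 1.0 - var_u_new / var_y
    return r2_new if r2_new <= 1.0 else None


# Point 3: is a second predictor worth adding? (33 units, var_y = 25)
for var_w in (25.0, 100.0):
    needed = r2_needed_with_extra_predictor(33, 1, 25.0, 0.90, var_w)
    print(f"sampling variance {var_w:g}: R^2 needed =", needed)

# Point 4: 33 heterogeneous units versus 66 more homogeneous ones.
mse_33 = regression_mse(33, 1, 25.0 * (1 - 0.90), 25.0)
mse_66 = regression_mse(66, 1, 100.0 * (1 - 0.975), 50.0)
print(f"MSE with 33 units = {mse_33:.2f}, with 66 units = {mse_66:.2f}")
```

With these inputs the two designs yield nearly the same mean squared error even though each of the 66 units has twice the sampling variance, which is the situation in which the more flexible 66-unit equation would be preferred.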

OTHER  CONSIDERATIONS 

The  use  of  regression  is  not  without  problems.  First,  the 
$(p + 1)\sigma_w^2 / n$  factor  limits  the  number  of  predictor  variables
that  can  be  used.  Since  the  likelihood  of  being  missed  in  the 
census  is  no  doubt  influenced  by  many  factors,  our  model 
will  be  overly  simple.  In  1970,  while  we  had  seven  indicators 
known  to  be  related  to  population  growth,  we  were  only 
able  to  utilize  four  of  them  in  a  regression  equation  esti- 
mating 1960-1970  population  growth  [5,  6].  This  was  for 
389  observations  with  a  larger  variance  of  the  true 
values  than  is  likely  to  obtain  for  our  dependent  variable 
here.  For  a  more  similar  application,  estimating  unemploy- 
ment rates  for  SMSA's,  Gonzalez,  Hosa,  and  I  [6,  11]  had 
122  observations,  and  the  sampling  variances  were  large. 
The  best  results  were  obtained  with  only  two  of  six  avail- 
able predictors  being  used. 

A second important problem comes from the small number
of observations. Regression equations depend on a set
of  assumptions  that  can  be  spoiled  in  an  application  such 
as  this  one.  In  particular,  we  must  look  for  possible  relation- 
ships between  the  errors  of  the  sample  estimates  and  the 
true  values.  Does  the  sampling  variance  of  the  undercount 
estimate  get  larger  where  undercount  is  greater?  This  could 
occur  if  high  undercount  areas  are  concentrated  in  a  few 
neighborhoods  in  cities.  Are  there  observations  with  par- 
ticularly large  deviations  from  the  regression  line?  Use  of 
regression  to  estimate  both  population  and  unemployment 
[7,  11]  has  shown  the  effects  of  outliers  due  to  measure- 
ment error  to  be  substantial.  The  need  to  examine  the 
assumptions  of  linear  regression  is  important  here. 

This  problem  is  related  to  that  of  the  universe  of  gen- 
eralization that  occurs  when  equations  are  computed  on 
one  kind  of  unit  and  estimations  are  wanted  for  another 
kind. The problem, pointed out by Fay [8], was illustrated
in our 1970 population estimates for counties
[5]. There, the regression equation was computed on
PSU's  from  the  Current  Population  Survey,  which  were 
heavily  weighted  toward  urban  areas.  Most  counties  are 
rural  and  have  small  populations.  The  equations  based  on 
PSU's  were  less  accurate  for  estimating  population  growth 
for  counties  with  smaller  populations.  Thus,  a  regression 
equation  using  central  cities  and  suburbs  as  units  is  preferable 
to  one  using  SMSA's.  But  if  undercount  estimates  are  wanted 
for  counties  and  other  smaller  jurisdictions,  we  would  prefer 
a  regression  equation  computed  from  a  sample  of  at  least 
similar  units.  The  problem  of  generalization  also  occurs  in  a 
second  way.  If  we  have  50  States,  there  may  be  only  a 
handful  with  a  given  characteristic,  say  a  large  Hispanic 
population,  that  influences  the  level  of  undercount.  The 
influence  could  be  important  for  these  few  States,  but 
including  the  characteristic  as  a  variable  in  regression  might 
fail  to  give  a  substantial  enough  increase  in  the  underlying 
R2  to  offset  the  added  error  of  the  sampling  variance  com- 
ponent. 

With  these  problems,  how  should  we  proceed?  I  suggest 
two  strategies.  One  is  to  spend  some  time  perfecting  a 
synthetic  estimate.  We  will  have  a  very  large  national  sample 
of  households  with  which  to  estimate  the  effects  of  a  large 
set  of  variables  on  the  individual  likelihoods  of  being  missed 
by  the  census,  using  regression  and  log-linear  techniques. 
Individual  probabilities,  based  on  a  large  number  of  factors, 
can  be  computed,  and  these  can  be  aggregated  over  the 
demographic  structure  of  a  place  to  give  an  adjusted  count. 
The  goodness  of  fit  of  such  estimates  could  be  tested  by 
analysis  of  covariance.  These  synthetic  estimates  could  be 
used  as  predictor  variables  in  regression  along  with  other 
predictor  variables.  Should  another  predictor  appear  to  be 
important,  we  might  then  revise  the  synthetic  estimate.  For 
example,  if  we  had  a  two-variable  equation  with  two  im- 
portant predictors,  an  age-race-sex  synthetic  estimate  and  a 
dummy  variable  indicating  location  in  a  central  city,  we 
might  then  try  an  age-race-sex-central  city  or  other  synthetic 


estimate.  The  availability  of  both  synthetic  and  regression 
procedures  gives  us  a  great  deal  of  flexibility  in  computing 
auxiliary  estimates,  and  the  presence  of  sample  estimates 
allows  us  to  evaluate  their  goodness  of  fit. 
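
As a rough illustration of how individual probabilities might be aggregated over the demographic structure of a place, the following Python sketch assumes a table of miss probabilities by race-sex cell. The cell definitions, the probabilities, the counts, and the function name `synthetic_adjusted_count` are all hypothetical; in practice the probabilities would come from the regression or log-linear analysis described above, and the rule that a cell counted c times represents c / (1 - m) people is an assumption of the sketch.

```python
# Hypothetical miss probabilities by demographic cell (illustrative values only,
# not estimates from any census evaluation).
MISS_PROB = {
    ("black", "male"): 0.10,
    ("black", "female"): 0.06,
    ("white", "male"): 0.02,
    ("white", "female"): 0.01,
}


def synthetic_adjusted_count(census_counts, miss_prob=MISS_PROB):
    """Aggregate cell-level miss probabilities over a place's census counts.

    A cell counted c times with miss probability m is taken to represent
    c / (1 - m) people, since only a fraction (1 - m) of them were counted.
    """
    return sum(c / (1.0 - miss_prob[cell]) for cell, c in census_counts.items())


# Hypothetical place: census counts by demographic cell.
place = {
    ("black", "male"): 9_000,
    ("black", "female"): 9_400,
    ("white", "male"): 49_000,
    ("white", "female"): 49_500,
}

census_total = sum(place.values())
adjusted_total = synthetic_adjusted_count(place)
implied_rate = (adjusted_total - census_total) / census_total
print(census_total, round(adjusted_total), round(implied_rate, 4))
```

Adding a factor such as central-city location would only mean refining the cell keys, which is the kind of revision of the synthetic estimate suggested above.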

Secondly, how should we deal with discrepancies between
sample and regression or synthetic estimates? In general, if
the mean squared error of the regression estimates is less
than the sampling variances, we would reject those cases
where sample estimates were greatly different from regression
estimates and recompute the regression equation [7, 11].
When  this  was  done  for  1970/1960  population  growth  ratios, 
the  mean  squared  error  of  the  regression  estimates  was  re- 
duced by  17  percent  with  four  predictors  and  3  of  389 
observations  removed.  When  this  was  done  for  1970  unem- 
ployment rate  estimates,  the  mean  squared  error  was  reduced 
by  16  percent  with  two  predictors  and  6  of  122  observations 
removed. 
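
The reject-and-recompute step can be sketched as follows. This is a minimal illustration with a single predictor and made-up data, not the procedure of [7, 11]; the cutoff of `k` residual standard deviations and the function names are assumptions of the sketch.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def refit_without_outliers(x, y, k=2.0):
    """Fit, drop cases whose residual exceeds k residual SDs, then refit."""
    a, b = fit_line(x, y)
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    sd = (sum(r * r for r in resid) / (len(x) - 2)) ** 0.5
    keep = [i for i, r in enumerate(resid) if abs(r) <= k * sd]
    a2, b2 = fit_line([x[i] for i in keep], [y[i] for i in keep])
    return (a, b), (a2, b2), keep


# Hypothetical data: an exact line y = 1 + 2x with one grossly discrepant case.
x = list(range(10))
y = [1 + 2 * xi for xi in x]
y[5] += 30

first_fit, second_fit, kept = refit_without_outliers(x, y)
print(first_fit, second_fit, kept)
```

In this toy case the single discrepant observation pulls the first slope away from 2, and the refit on the remaining nine cases recovers the underlying line.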

When  the  mean  squared  errors  of  regression  and  synthetic 
estimates  are  comparable,  James-Stein  weighting  procedures 
[8,  9]  are  appropriate.  We  still  need  to  resolve  the  issue  of 
what happens when the two estimates are far apart. There
is some feeling that the sample estimate should be given
precedence by not allowing the weighted estimate to differ
from it by more than a specified amount [8, p. 172], say, one standard
deviation of the sample estimate.
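
One way to picture such a constraint is the sketch below. It forms a simple precision-weighted average of a sample estimate and a regression estimate, which is in the spirit of, but not identical to, the James-Stein procedures of [8, 9], and then holds the result within a stated number of standard deviations of the sample estimate. The function name, the weighting rule, and the numerical inputs are assumptions made for illustration.

```python
def constrained_composite(sample_est, sample_var, reg_est, reg_mse, max_sd=1.0):
    """Precision-weighted average of sample and regression estimates,
    held within max_sd sample standard deviations of the sample estimate."""
    w = reg_mse / (reg_mse + sample_var)        # weight on the sample estimate
    composite = w * sample_est + (1.0 - w) * reg_est
    limit = max_sd * sample_var ** 0.5
    lower, upper = sample_est - limit, sample_est + limit
    return min(max(composite, lower), upper)


# Hypothetical undercount rates: sample estimate 0.050 with standard deviation 0.01.
far = constrained_composite(0.050, 0.0001, 0.020, 0.0001)   # estimates far apart
near = constrained_composite(0.050, 0.0001, 0.045, 0.0001)  # estimates close
print(far, near)
```

When the two estimates are far apart, the constraint binds and the result sits one standard deviation from the sample estimate; when they are close, the plain weighted average is returned.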

Selection  of  an  estimate  must  of  course  be  done  after 
examination  of  the  empirical  evidence.  At  this  point,  I  favor 
a  computation  of  a  synthetic  estimate  based  on  as  many 
factors  as  the  individual-level  regression  or  log-linear  analysis 
of  individual  likelihoods  of  being  uncounted  indicate  to  be 
relevant.  Such  estimates  for  large  areas  can  be  evaluated  using 
regression,  as  we  indicated  in  the  previous  section.  Relating 
the  final  estimates  to  individual  likelihoods  of  being  counted 
should  increase  the  credibility  of  final  estimates.  If  the  mean 
squared  error  is  comparable  to  the  sampling  variance,  we  could 
consider  a  weighted  average  with  the  sample  estimate,  al- 
though the  increased  complexity  of  the  estimate  might  offset 
the  advantages  of  allowing  data  actually  collected  in  the 
area  for  which  the  estimate  is  computed  to  influence  this 
estimate. 
INTRODUCTION


After the 1980 U.S. census, there will be pressure to modify
the population counts before they are used in the formula
allocation of funds [2]. The Census Bureau has suggested
the need for modification by its convincing argument about
omission rates¹ in recent censuses; in 1970, the black omission
rate (7.7 percent) was four times the white omission
rate (1.9 percent) [5]. The Bureau continues to clarify the
situation  and  to  develop  data  modification  methods;  July 
1978  was  the  end  of  a  Census-sponsored  National  Research 
Council  [4]  project  that  evaluated  plans  for  the  1980  census. 
The  problems  of  census  count  modification  were  emphasized. 
At  the  same  time,  and  with  the  same  sponsorship,  a  project 
began  on  population  and  income  data  needed  for  small 
areas.  This  project  must  also  include  work  on  census  data 
modification. A major current Census Bureau publication
on the possibility of modifying census counts is [7], Developmental
Estimates of the Coverage of the Population of
States  in  the  1970  Census:  Demographic  Analysis,  which 
will  be  referred  to  as  SPRR,  in  honor  of  the  authors,  Siegel, 
Passel,  Rives,  and  Robinson.  SPRR  was  the  stimulus  for  the 
preparation  of  this  essay. 

In  the  remainder  of  this  section,  the  SPRR  title  will  be 
examined  word  by  word  to  set  out  some  of  the  themes 
of  this  paper,  which  is  not  a  summary  or  review  of  SPRR. 
I  do  not  assume  my  readers  have  read  SPRR. 

"Developmental"  is  the  key  word.  At  the  present  time, 
the  Census  Bureau  does  not  propose  a  method  that  would 
be  acceptable  for  modifying  census  counts.  The  hard  work  of 
SPRR  is  only  a  small  part  of  what  would  be  needed  to  have 
an  acceptable  census  data-modification  method.  SPRR  and 
this  essay  consider  future  steps  in  the  development  of  ac- 
ceptable methods.  Reading  SPRR,  one  cannot  tell  if  the 
authors  feel  disappointed  because  they  did  not  have  much 
success  in  finding  acceptable  methods  for  modifying  census 
counts.  (See  [7] ,  p.  105,  "Levels  of  Usage.") 

The  primary  task  of  SPRR  is  to  construct  "estimates  of 
the  coverage  of  the  population"  rather  than  "estimates  of  the 
population."  Thus  SPRR  examines  the  quality  of  the  1970 
census  in  contrast  to  modifying  the  census  results.  SPRR 
suggests  plausible  alternatives  to  the  census  counts  in  contrast 
to  modifications  which  could  replace  the  census  counts.  It 


is  awkward  to  give  a  good  analogy  to  the  situation,  so  I  will 
settle  for  several  paragraphs  of  discussion. 

SPRR  examines  in  detail  the  quality  of  the  1970  census. 
What  are  the  sizes  of  the  undercounts  for  individual  popula- 
tion segments?  The  emphasis  is  not  on  finding  an  alternative 
set  of  population  estimates.  You  could  work  in  the  spirit 
of  SPRR  in  the  following  situation:  A  set  of  dependent 
observations  have  been  regressed  on  a  set  of  independent 
variables.  You  are  given  the  least  squares  residuals  and  the 
values  of  the  independent  variables.  From  this  you  could 
learn  much  about  the  data  and  model.  You  could  detect 
outliers, you could see if the errors are relatively uncorrelated
or if there are some built-in dependencies, and you
could locate regions in which the model works well and
where it works poorly. But you cannot sketch the response
surface or predict the response for a given set of independent
variables. If you were interested in the measurement process
(possibly to improve it) rather than in the measurements,
this situation might be satisfying.

When  a  set  of  data  are  examined,  it  is  often  possible  to 
spot  errors,  but  how  to  correct  the  errors  is  not  always 
evident. The recent paper by Lindley et al. [3] discusses
this problem in detail for the situation where a person presents
personal probabilities that are not consistent with the
laws of probability. For example, what should be done when
a person gives $P(A) = 0.3$ and $P(\bar{A}) = 0.5$? Clearly, at least
one of the probabilities should be increased by at least 0.1, but the
rational correction is not evident. Likewise, errors can be
detected in the demographic data. In the 1970 census, there was a
30-percent excess of blacks aged 25 to 34
who said they were born in the Northeast, compared with what
birth and death records indicated. Although the evidence indicates
that most of the error is in the State-of-birth data,
the  situation  does  not  spell  out  the  precise  allocation  of  the 
error.  Thus  SPRR  emphasizes  the  sources  of  trouble  and 
plausible  sizes  of  errors  without  attempting  to  correct  the 
errors. 

"Population"  is  of  particular  concern,  but  other  variables 
have  measurement  problems;  it  is  very  difficult  to  measure 
income,  which  plays  an  important  role  in  the  allocation 
formula  for  general  revenue  sharing.  Attention  is  fixed  on 
population  because  so  much  is  known  about  it;  it  appears 


1 For expository purposes, this document will discuss rates for two
races,  "black"  and  "white."  The  paragraph  between  equations  (18) 
and  (19)  indicates  possible  meanings  for  "black"  and  "white."  In 
applications,  more  than  two  races  are  required. 


Work  sponsored  by  the  National  Science  Foundation 
under  Grants  7810496-SOC  and  7818166-SOC. 
in most uses of census data, it is a relatively easy concept to
understand,2  and  it  is  relatively  accepted  as  something 
appropriate  for  the  Government  to  measure. 

The  data  and  methodological  resources  of  SPRR  are 
strained  even  to  estimate  the  omission  rates  for  the  "States." 
Even  larger  correction  rates  are  likely  to  be  needed  at  the 
sub-State  level.  Work  at  that  level  awaits  new  data,  new 
methodology,  and  new  money. 

SPRR's concentration on the 1970 census is appropriate
in terms of available data. However, one census does not
provide a sound foundation for the development and validation
of methodology. It might be enlightening to apply the
SPRR  methodology  to  new  situations,  such  as  the  1960  or 
the  1980  censuses.  This  is  an  awkward  task,  since  the  set  of 
data  associated  with  each  census  has  substantial  differences 
from  the  other  censuses. 

SPRR  discusses  several  methods  for  estimating  omission 
rates.  Its  emphasis  is  on  "demographic  analysis,"  an  approach 
that  does  not  emphasize  "random,"  as  in  "random  error." 
Thus,  for  SPRR,  "estimates"  refer  to  plausible  bounds  on 
errors  rather  than  to  standard  statistical  concepts  such  as 
confidence  intervals. 

Estimates  of  population  sizes  from  demographic  analysis 
would  be  hard  to  accept.  The  estimates  do  not  make  direct 
use  of  the  census  counts.  In  fact,  the  estimates  are  from 
independent  noncensus  data;  they  are  replacements  rather 
than  modifications.  Further,  SPRR  does  not  offer  hope  of 
ever  directly  checking  the  quality  of  the  estimates  from 
demographic  analysis.  But  estimates  that  cannot  be  verified 
are  suspect. 

I fear my remarks about SPRR will leave the wrong impression.
My criticism stems in large part from my wish to see us
move further along into this important area. The SPRR
authors are to be commended. They have done a large task
with great care and ingenuity, and they have maintained a
cautionary position in spite of pressure to do otherwise.

CENSUS  DATA  ARE  NOT  PERFECT 

In  1970,  there  were  more  than  200  million  people  to  be 
counted  by  the  U.S.  census.  About  2.5  percent  of  the  popu- 
lation were  omitted— 1.9  percent  of  the  whites  and  7.7  per- 
cent of  the  blacks.  SPRR's  task  is  made  more  difficult  because 
of  lack  of  perfection  in  the  data  from  counted  individuals. 
Thus,  State  of  birth  is  used  as  a  foundation  for  SPRR's 
demographic  analysis;  3.9  percent,  or  3,957,000  whites 
under  age  35,  did  not  report  a  State  of  birth,  and  for  blacks 
residing  in  the  Northeast  who  were  between  ages  25  and  34, 
the  State  of  birth  was  given  in  error  about  30  percent  of  the 
time.  The  SPRR  analysis  of  State-of-birth  data  should  have 
independent  interest  to  scholars  and  the  public. 


SPRR  examines  the  data  in  detail.  Thus,  it  seems  un- 
likely that  additional  major  problems  in  the  data  remain  to 
be  discovered.  One  topic  that  the  public  might  feel  does  not 
receive  adequate  attention  is  that  of  illegal  aliens.  If,  as 
some  believe,  illegal  aliens  constitute  an  appreciable  propor- 
tion of  some  local  populations,  then  their  omission  in  the 
SPRR  analysis  could  be  serious.  (See  [1]  and  [9],  which 
explicitly  recognize  the  problem.)  It  is  not  clear  if  the  authors 
of  SPRR  have  considered  this  problem  and  made  provision 
for  it.  The  Census  Bureau  uses  a  variety  of  techniques  to 
impute  the  existence  of  some  people  and  characteristics  of 
some  other  people;  the  SPRR  discussion  does  not  overtly 
include  this  topic. 

NOTATION  AND  STATISTICAL  FRAMEWORK 

Before  attempting  to  discuss  the  needs  for  modification 
of  population  counts  and  the  associated  problems,  it  is 
efficient  to  develop  some  notation.  Basic  relationships  be- 
tween the  defined  terms  will  be  derived  for  their  immediate 
interest  and  future  needs.  The  beginnings  of  a  statistical 
framework  will  appear. 

To avoid excessive complexity and wordiness, concepts
will be introduced that do not exactly match census practice.
In particular, the population consists of two races, blacks
and whites. Symbols such as $b$, $\beta$, $B$ will be used for observed
or computed numbers and proportions of blacks;
$w$, $\omega$, $W$ for whites; and $p$, $\pi$, $P$ for total populations. The
observed or computed value of a count or proportion will
typically not be the same as the target value corresponding
to the definitions and theory underlying the observation
process. These target values or parameters will be indicated
by an arrow overscore, for example, $\vec{w}$. The number of regions,
such as the 50 States and the District of Columbia
or the four census regions of the United States, will be
represented by $n$. The regions could more broadly be taken
as  categories  like  age  groups  or  combinations  of  geographic 
and  economic  groups.  The  important  point  is  that  they  form 
a  partition  of  the  population.  Subscripts  will  identify  regions. 
Symbols  without  subscripts  will  represent  totals  or  averages; 
the  context  will  make  clear  which.  The  term  "States"  will  be 
used  instead  of  "regions"  for  concreteness. 

Let $b_i$ ($w_i$) be the census count of blacks (whites) in the
$i$th State. Even "the census count" is an awkward term to
define in practice (see comments below, regarding displays
(1) through (4)); here it is assumed that the definition has
been specified and that the counts are available. The census
count of the population in the $i$th State is

$$p_i = b_i + w_i \tag{1}$$


The  national  counts  are 


2  The  technical  and  legal  definition  of  population  is  complex  and 
subject  to  controversy.  See  Robert  Reinhold,  "Dispute  Over  Aliens 
Snarls  Census  Plans,"  New  York  Times,  December  21,  1979,  pp.  A1 
and  A24. 


$$b = \sum_i b_i \tag{2}$$

$$w = \sum_i w_i \tag{3}$$

and

$$p = \sum_i p_i = b + w = \sum_i (b_i + w_i) \tag{4}$$


Displays  (1)  through  (4)  contain  some  very  acceptable 
consistency  relationships,  such  as  the  total  census  count  is 
the  sum  of  the  black  census  count  and  the  white  census 
count.  In  practice,  this  might  not  be  the  case  because  race 
was  not  determined  for  all  those  counted,  or  the  total  count 
might  be  published  before  racial  counts;  hence,  the  racial 
counts  might  contain  records  obtained  after  the  total  count 
was  given.  The  U.S.  census  and  SPRR  attempt  to  maintain 
such  consistency,  and  SPRR  requires  the  same  consistency  in 
its  modifications.  These  consistency  relations  will  be  used  in 
our  analysis.  Some  of  the  detail  is  given  in  (8)  through  (17). 

Let upper case letters represent possible modifications
of the census counts of the corresponding lower case letters;
$b_i$ becomes $B_i$, $w$ becomes $W$, etc. The relationship between
upper and lower case symbols can be expressed in terms of
omission rates in the following ways:

$$B_i = (1 + \beta_i) b_i, \quad \text{or} \quad \beta_i = \frac{B_i - b_i}{b_i} \tag{5}$$

$$W_i = (1 + \omega_i) w_i, \quad \text{or} \quad \omega_i = \frac{W_i - w_i}{w_i} \tag{6}$$

and

$$P_i = (1 + \pi_i) p_i, \quad \text{or} \quad \pi_i = \frac{P_i - p_i}{p_i} \tag{7}$$


The selected modification process will generate the $B_i$
and $W_i$, which will determine the $\beta_i$, $\omega_i$, and $\pi_i$. In SPRR, the
$B_i$ and $W_i$ are based on data that ideally are not related to the
current census.³ Thus, such expressions as $B_i = (1 + \beta_i) b_i$
should not be interpreted in the SPRR analysis as correcting
the black State count to yield the modified black State
count.⁴ Rather, $(1 + \beta_i)$ is the ratio of two independent
estimates of the black population in State $i$. The basic task
of SPRR is the development of the estimates $B_i$ and $W_i$.
It gives several related estimates. Notice the intent is
$\vec{B}_i = \vec{b}_i$, etc. That is, the census and the modification process
have the same target population. The above relationships are
not independent, since $P_i = B_i + W_i$, i.e.,

$$(1 + \pi_i) p_i = (1 + \beta_i) b_i + (1 + \omega_i) w_i \tag{8}$$


3  In  fact,  the  current  census  is  used  by  SPRR  for  migration  data. 

4 It is, however, natural to speak of estimates of $\beta_i$, etc. In applications
of the SPRR and related methodologies, one is likely to estimate
these rates for population strata that are finer than those for
which demographic analysis provided alternative counts.


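Then, from (1),

$$\pi_i = \frac{\beta_i b_i + \omega_i w_i}{b_i + w_i} \tag{9}$$

For the considered procedures, $B = \sum_i B_i$ or

$$(1 + \beta) b = \sum_i (1 + \beta_i) b_i \tag{10}$$

Now use (2)

$$\beta = \frac{\sum_i \beta_i b_i}{\sum_i b_i} \tag{11}$$

Also,

$$W = \sum_i W_i \tag{12}$$

$$\omega = \frac{\sum_i \omega_i w_i}{\sum_i w_i} \tag{13}$$

$$P = \sum_i P_i \tag{14}$$

$$\pi = \frac{\sum_i \pi_i p_i}{\sum_i p_i} \tag{15}$$

$$P = B + W \tag{16}$$

and

$$\pi = \frac{\beta b + \omega w}{b + w} \tag{17}$$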


It is important to verify that (15) and (17) are consistent,

$$\pi = \frac{\beta b + \omega w}{b + w} = \frac{\sum_i \beta_i b_i + \sum_i \omega_i w_i}{\sum_i b_i + \sum_i w_i} = \frac{\sum_i \pi_i p_i}{\sum_i p_i}$$

where the equalities are from (17), (11 and 13), (9), and (15).
In the following analysis, it is assumed that the omission
rates for the United States as a whole are correct, that is,
$\vec{P} = P$ and $\vec{\pi} = \pi$, so that $P = (1 + \pi)p$ could be written as
$\vec{P} = (1 + \vec{\pi})p$. Similar remarks apply to $B$ and $W$. Unless
necessary, the arrows will not be used. The following
numerical values from Census [5] are assumed:

$$\pi = 0.025, \quad \beta = 0.077, \quad \text{and} \quad \omega = 0.019 \tag{18}$$
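
As a quick arithmetic check on (18), relation (17) can be solved for the black share of the census count that the three national rates jointly imply. The Python sketch below does only that; the variable names are mine, and because the published rates are rounded, the implied share should be read as approximate.

```python
pi_us, beta_us, omega_us = 0.025, 0.077, 0.019  # values assumed in (18)

# Solving (17) for the black share s = b / (b + w) of the census count:
#   pi = beta * s + omega * (1 - s)  =>  s = (pi - omega) / (beta - omega)
black_share = (pi_us - omega_us) / (beta_us - omega_us)

# Round trip: the share reproduces the national rate through (17).
pi_check = beta_us * black_share + omega_us * (1.0 - black_share)
print(round(black_share, 4), round(pi_check, 4))
```

The implied share is a little over one-tenth of the counted population, so the three values in (18) are mutually consistent under the two-race simplification used here.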


Since the census counts $\{b_i, w_i\}$ are known, the values of
$b$, $w$, $p$, $B$, $W$, and $P$ are known. The problem is to estimate
the $\{\beta_i, \omega_i\}$ and the $\{B_i, W_i, P_i\}$.

To show a possible use of this model, consider the "basic"
modification process. (This is considered again in sections
7 and 8, but it is not a procedure seriously considered
by SPRR.) In the basic procedure for each State, one defines
$\beta_i = 0.077$ and $\omega_i = 0.019$. Then one can compute
$P_i = 1.077 b_i + 1.019 w_i$, and from (9) the $\pi_i$ can be computed.
As examples, this process yields omission rates of
0.019 for South Dakota and 0.041 for Mississippi.⁵ The basic
method used here is closely related to the synthetic basic
of [6]. That report considers three races, and in table 1 the
omission rate for the third race is 0. Thus, the table has
omission rates less than 0.019! SPRR (p. 5, no. 13) works
with two races: white, and black and other races. Table
VII-G, based on the Current Population Survey, considers
whites and other races, and blacks. (The heading of this table
mentions an adjustment for imputation in the census.)
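
The basic procedure is simple enough to sketch in a few lines of Python. The two "States" below use hypothetical census counts chosen to have a very small and a fairly large black share; they are not the actual South Dakota or Mississippi figures, and the function name is an assumption of the sketch.

```python
BETA, OMEGA = 0.077, 0.019  # national omission rates from (18)


def basic_modification(b_i, w_i, beta=BETA, omega=OMEGA):
    """Return (P_i, pi_i) for one State under the basic procedure."""
    p_i = b_i + w_i                                  # census count, (1)
    P_i = (1.0 + beta) * b_i + (1.0 + omega) * w_i   # modified count
    pi_i = (beta * b_i + omega * w_i) / p_i          # omission rate, (9)
    return P_i, pi_i


# Hypothetical census counts (blacks, whites), not actual State figures.
states = {"State X": (2_000, 998_000), "State Y": (380_000, 620_000)}
rates = {name: basic_modification(b, w)[1] for name, (b, w) in states.items()}
for name, rate in rates.items():
    print(name, round(rate, 3))
```

Because each State rate is a weighted average of 0.077 and 0.019, with weights given by the racial shares of the census count, it must fall between those two values.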
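The estimated omission in State $i$ is

$$A_i \overset{\text{def}}{=} P_i - p_i = \pi_i p_i = \beta_i b_i + \omega_i w_i \tag{19}$$

The error rate is

$$R_i \overset{\text{def}}{=} \frac{A_i}{P_i} = \frac{P_i - p_i}{P_i} = \frac{\pi_i}{1 + \pi_i} \tag{20}$$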


Usually the $\beta_i$, $\omega_i$, and $\pi_i$ are positive, and their absolute
values seldom exceed 0.1. Notice $R_i$ is computed with the
modified count in the denominator, while $\pi_i$ has the census
count in the denominator. With the presumption that the
modified count is better than the census count, the $R_i$ is
a better measure of the error rate than $\pi_i$; $R_i$ and $\pi_i$ seldom
differ by much and they always have the same algebraic
sign. The analysis could be developed in terms of $\pi_i$ or $R_i$.
The use of $\pi_i$ makes the expressions for going from counts
to modification relatively simple; $R_i$ would be useful in
going from modification to counts. When $0 < \pi_i$, then

$$\pi_i - \pi_i^2 < R_i < \pi_i \tag{21}$$
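
A short numerical check of (20) and (21) shows how close the two rates are over the range of values met in practice. The Python sketch below uses a handful of illustrative omission rates; the function name is mine.

```python
def error_rate(pi_i):
    """R_i from (20): omission relative to the modified count."""
    return pi_i / (1.0 + pi_i)


for pi_i in (0.019, 0.025, 0.041, 0.077, 0.1):
    r_i = error_rate(pi_i)
    # Inequality (21) for a positive omission rate.
    assert pi_i - pi_i ** 2 < r_i < pi_i
    print(f"pi_i={pi_i:.3f}  R_i={r_i:.4f}  gap={pi_i - r_i:.4f}")
```

The gap between the two rates is $\pi_i^2/(1 + \pi_i)$, which is below 0.01 whenever $\pi_i$ does not exceed 0.1.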


To  this  point,  the  notation  has  been  purely  descriptive 
and  did  not  have  any  statistical  framework.  In  the  next  para- 
graph, notation  is  introduced  that  would  be  useful  in  the 
statistical  analysis  of  the  modification  process.  In  the  appen- 
dix to  this  section  (appendix  A),  several  probabilistic  models 
are  given  that  describe  in  part  the  modification  process. 
Those  models  are  not  presented  in  the  body  of  this  section 
because  they  are  artificial  and  would  be  inappropriate  to 
apply  to  data.  The  models  illustrate  types  of  reasoning  and 
the  existence  of  consistent  mathematical  structures  to  des- 
cribe the  modification  process.  The  purpose  of  this  essay 
is  not  to  construct  useful  modification  procedures.  The 
statistical  models  in  later  sections  have  elements  of  im- 
mediate interest.  The  models  in  the  appendix  might  be  useful 
in  Monte  Carlo  studies  of  the  modification  process.6 


5 The basic estimates cannot yield $\pi_i < 0.019$.

6 For some situations, realistic stochastic models for census modification
are available. See the paper by Ivan Fellegi in these proceedings.


For each computed or observed quantity such as $B_i$ or
$p_i$, there can be a corresponding random variable designated
by "*" such as $B^*$ and $p^*$. A random variable such as $p^*$
can incorporate many different aspects of the data problem.
Before a census is conducted, $p^*$ and its distribution function
summarize our knowledge of the count to be obtained
of the population. The distribution is based on available
demographic data and the planned performance characteristics
of the forthcoming census. The distribution of $B^*$
after the census has been performed would depend on how
the census process appeared to operate and on the
modification procedures used. In some circumstances, it is necessary
to use a strong subjective component in evaluating these
distributions, while in others, it will appear that ample
frequency information is available. For example, if the
States all have the same distribution for $B^*$, and this distribution
does not change between censuses, there is likely to
be agreement on the common distribution of $B^*$.

The emphasis here is not on finding these distribution
functions but on pointing out their existence. As a corollary,
because the modification process is in a probability framework,
it will be possible and appropriate to subject the modification
process to statistical analysis.

The  hard  work— not  begun  here— includes: 

1.  Selecting  an  adequate  philosophical  base.  The  material 
at  hand  does  not  readily  respond  to  a  naive  frequentist 
approach.  As  in  many  social  problems,  it  appears  that 
the  entire  population  is  observed;  that  is,  the  fixed 
population  of  States.  This  can  be  circumvented  by  such 
devices as (a) thinking of "the States" as a sample
from a hypothetical population of "States" or (b)
thinking of sub-State units as the basic random elements.
Since each census has so many special features,
it is not useful to think of the population of U.S.
censuses. It is useful to think of the populations of
special censuses, census checks, etc.

2.  The  choice  of  philosophical  base  will  be  intimately 
woven  into  the  kinds  of  models  that  are  found  appro- 
priate to  describe  the  phenomenon  and  the  kinds  of  de- 
cisions and  inferences  for  which  the  models  are  made. 
This  work  is  also  in  the  developmental  stage  [10]. 

Before closing these general remarks, note that one might
need the joint distribution of several random variables, such
as $\beta_i$, $\beta_j$, and $\omega_i$. For the analysis in this paper, only the low
moments of random variables will be used. Thus, the most
complex item of interest would be a covariance.

NEED  FOR  MODIFICATION 

The State omission rates, when national values are applied
at the State level, were indicated in (18). SPRR seriously
considers other assumptions that generate error rates as large
as 9 percent, and some of their extreme assumptions result
in rates over 11 percent. (The States that generate extreme
values  are  Alaska  and  New  Mexico;  SPRR  considers  features 
other  than  the  white-black  dichotomy.)  It  is  hard  to  tell  if 
these  percentages  are  important.  To  begin  to  understand 
their  significance,  it  is  useful  to  look  at  how  population  is 
used. 

A  small  change  in  the  population  of  a  State  can  change 
by  one  the  size  of  its  congressional  delegation.  It  appears 
to  take  a  change  of  about  470,000  for  the  next  seat  to  be 
gained or lost. California, with a population of $2 \times 10^7$ and
with omission rate in the range 2.8 to 4.7 percent [7, table
VIII-D], has an omitted population between $5.6 \times 10^5$
and $8.8 \times 10^5$. So California would appear underrepresented
by  at  least  one  seat.  But  this  analysis  is  too  simple.  If  the 
California  count  is  to  be  corrected,  then  the  other  State 
counts  must  be  corrected.  In  fact,  when  the  adjustments 
of  SPRR  are  made,  changes  in  the  sizes  of  the  delegations 
are  relatively  hard  to  predict.  Oklahoma  loses  a  seat  for 
most  of  the  proposed  ways  of  modifying  the  count.  Cur- 
rently, Oklahoma  has  six  Members  of  Congress,  the  average 
size  of  a  district  is  quite  small  (427,000),  and  the  omission 
rate  is  small.  None  of  the  modifications  that  have  been 
analyzed  by  SPRR  resulted  in  a  change  of  a  State's  delega- 
tion by  more  than  one  seat  or  in  a  change  for  the  California 
delegation  (but,  see  [6] ,  p.  13). 
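
The difficulty of predicting delegation changes can be seen in a toy example. The Python sketch below apportions seats by the method of equal proportions, the method used for the U.S. House, among three hypothetical States; the populations, the omission rates, and the seven-seat house are invented for illustration and are not taken from SPRR.

```python
import heapq


def equal_proportions(populations, seats):
    """Apportion seats by the method of equal proportions (Huntington-Hill)."""
    alloc = {s: 1 for s in populations}          # every State gets one seat
    # Max-heap of priority values pop / sqrt(n * (n + 1)) for the next seat.
    heap = [(-pop / (1 * 2) ** 0.5, s) for s, pop in populations.items()]
    heapq.heapify(heap)
    for _ in range(seats - len(populations)):
        _, s = heapq.heappop(heap)
        alloc[s] += 1
        n = alloc[s]
        heapq.heappush(heap, (-populations[s] / (n * (n + 1)) ** 0.5, s))
    return alloc


# Hypothetical census counts and omission rates for three States.
census = {"X": 1_000_000, "Y": 995_000, "Z": 2_500_000}
omission = {"X": 0.01, "Y": 0.03, "Z": 0.02}
modified = {s: pop * (1.0 + omission[s]) for s, pop in census.items()}

census_seats = equal_proportions(census, seats=7)
modified_seats = equal_proportions(modified, seats=7)
print(census_seats, modified_seats)
```

Every State's count rises under the modification, yet State X, which has the smallest omission rate, loses a seat to State Y. What matters is the change in a State's count relative to the changes in the others.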

The consequences of census omissions for within-State
apportionment have not been extensively studied [6]. Although
modification of census counts has not been considered
as a part of the apportionment process, the consequences
of omission must be known in order to judge how well the
Supreme Court mandate (one man, one vote) is satisfied.

At  this  time,  the  monetary  value  of  public  data  of  vary- 
ing quality  has  not  been  approximated.  The  consequences  of 
varying  the  quality  of  public  data— such  as  the  above  work  on 
apportionment— would  serve  as  guides  to  the  needs.  Policy 
analysts, legislators, and special-interest groups should find
such  studies  useful. 

Modified  population  counts  are  likely  to  be  used  in  the 
formula  allocation  of  funds.  The  formulas  in  current  use  at 
the  Federal  level  are  complex.  Here,  we  briefly  consider  three 
simple  allocation  processes. 

Application.  Each  individual  applies  for  his  appropriate 
amount.  Food  stamps  would  be  an  example.  "Individual" 
could  be  a  school  district,  town,  etc.  In  this  allocation 
process,  the  individual  must  be  able  to  supply  data  of  accep- 
table quality.  The  Government  needs  national  data  for 
budgeting,  and  some  local  data  of  limited  quality  are  needed 
for  setting  standards  and  administration.  Application  has  dis- 
advantages, such  as  personal  and  administrative  costs  of  the 
applications.  When  the  individuals  are  governments,  there 
could  be  substantial  local  needs  of  the  kinds  of  data  now 
provided  by  the  national  Government. 

Open ended. Assume the Government wishes to distribute to each State (or other unit of government) a fixed amount of money per individual in the State, say $A. Certain education funds are distributed to selected portions of the population in this manner.

In the open-ended allocation, if the counts are not modified, the ith State will receive

\[ A p_i \tag{22} \]

and the total allocation will be \(Ap\). Modifications are made in the following manner: Estimate the omission rates, say \(\beta_i\) and \(\omega_i\), and then compute the modified counts

\[ W_i = w_i (1 + \omega_i) \quad \text{and} \quad B_i = b_i (1 + \beta_i); \tag{23} \]

the modified State⁷ population is \(P_i = B_i + W_i\).

Thus, with the modification, the ith State will receive

\[ A P_i \tag{24} \]

and the total allocation will be

\[ AP = A \sum_i P_i . \tag{25} \]

The change in allocation to the ith State, as a result of modification, is

\[ A\Delta_i = A (w_i \omega_i + b_i \beta_i) \tag{26} \]

and the total change is

\[ A\Delta \tag{27} \]

where \(\Delta = \pi p\).


For  later  reference,  we  make  the  following  observations. 
The  change  in  an  allocation  is  A  times  the  corresponding 
population  modification.  Now  using  the  mode  of  thought 
begun  earlier,  one  obtains:  (1)  The  "average  change"  or 
"standard  deviation  of  change"  of  allocation  is  A  times  the 
corresponding  figure  for  population.  (2)  The  "correla- 
tion" between  modifications  in  population  and  allocation  is 
1.  The  relative  changes  in  population  and  allocation  are  iden- 
tical. 

This  allocation  process  is  attractive:  (1)  It  is  simple  to 
apply  and  to  explain.  (2)  In  most  situations  when  modifica- 
tions are  made,  each  State  will  receive  an  increased  allocation. 
(3) Further, if the Government replaces A by 0.975A, the
open-end feature does not result in a total allocation larger
than the one planned, $Ap.

The amount of money involved in the changes can be substantial. If the total allocation is $5 x 10^9, then the amount involved in the changes is about 0.025 x 5 x 10^9 = 1.25 x 10^8. This amount would be visible to the States. Hence, there would be tremendous pressure to make changes and the method of change would be hotly discussed.

⁷ Recall, "States" refers to any partition of the population.
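The open-ended bookkeeping in (22) through (27) can be sketched numerically. The fragment below is only an illustration under stated assumptions: the three "States," their counts, the omission rates, and the per-person amount A are all invented, and the function name `open_ended` is hypothetical.

```python
# Invented census counts (w_i, b_i) and omission rates (omega_i, beta_i)
# for three "States"; none of these numbers come from SPRR.
w = [900_000, 400_000, 150_000]
b = [100_000, 100_000, 50_000]
omega = [0.02, 0.015, 0.01]
beta = [0.07, 0.06, 0.05]


def open_ended(A, w, b, omega, beta):
    """Return (original, modified, change) allocations per State.

    original_i = A * p_i, change_i = A * Delta_i, modified_i = A * P_i,
    with p_i = w_i + b_i and Delta_i = w_i * omega_i + b_i * beta_i.
    """
    original = [A * (wi + bi) for wi, bi in zip(w, b)]
    delta = [wi * oi + bi * bti
             for wi, bi, oi, bti in zip(w, b, omega, beta)]
    change = [A * d for d in delta]
    modified = [o + c for o, c in zip(original, change)]
    return original, modified, change


A = 10.0  # dollars per person (assumed)
original, modified, change = open_ended(A, w, b, omega, beta)

p = sum(wi + bi for wi, bi in zip(w, b))   # national census count
total_change = sum(change)                 # A * Delta = A * pi * p
pi_national = total_change / (A * p)       # national omission rate pi

# Replacing A by 0.975 A keeps the modified total under the planned A * p
# as long as pi stays below roughly 0.0256.
scaled_total = 0.975 * sum(modified)

# Magnitude quoted in the text: 2.5 percent of a $5 x 10^9 allocation.
example_shift = 0.025 * 5e9
```

With these invented inputs every State gains, the total change is A times the total population modification, and the scaled total stays below the planned amount.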

The  process  outlined  in  this  section  would  in  practice 
be  quite  complicated.  Instead  of  50  States,  the  process  could 
be  applied  to  the  large  number  of  units  in  general  revenue 
sharing.  In  addition  to  race,  modifications  of  population 
values  could  use  other  variables  such  as  age  or  income.  The 
modification  process  can  involve  data  from  many  sources 
and  need  not  be  the  same  for  all  areas. 

Fixed pie. The Government allocates $A' to the States in proportion to their populations. Thus, State i receives

\[ A' \, \frac{w_i + b_i}{\sum_j (w_j + b_j)} . \tag{28} \]

If the counts are modified with the same method used for open-ended formulas, the modified allocation to State i is

\[ A' \, \frac{w_i (1 + \omega_i) + b_i (1 + \beta_i)}{\sum_j \left[ w_j (1 + \omega_j) + b_j (1 + \beta_j) \right]} . \tag{29} \]

The total allocation remains at $A'.

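The change in allocation to the ith State as a result of the modification process is:

\[ A' \left( \frac{P_i}{P} - \frac{p_i}{p} \right) = A' \left[ \frac{p_i (1 + \pi_i)}{p (1 + \pi)} - \frac{p_i}{p} \right] = A' \, \frac{p_i}{p} \cdot \frac{\pi_i - \pi}{1 + \pi} . \tag{30} \]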


Some  of  the  changes  will  be  positive  and  some  negative.  The 
total  change  is  0. 
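A small numerical check of (28) through (30) may help. It is a sketch under assumed inputs: the populations and omission rates are invented, and `fixed_pie` is a hypothetical helper name.

```python
def fixed_pie(A_prime, pops):
    """Split A_prime among the States in proportion to pops."""
    total = sum(pops)
    return [A_prime * x / total for x in pops]


# Invented census populations p_i and State omission rates pi_i.
p = [1_000_000, 500_000, 200_000]
rates = [0.025, 0.024, 0.02]

P = [x * (1 + r) for x, r in zip(p, rates)]   # modified populations P_i
pi_nat = sum(P) / sum(p) - 1                  # national rate pi

A_prime = 1e9                                 # fixed total (assumed)
before = fixed_pie(A_prime, p)                # formula (28)
after = fixed_pie(A_prime, P)                 # formula (29)
change = [a - b for a, b in zip(after, before)]

# Closed form (30): A' (p_i / p) (pi_i - pi) / (1 + pi).
closed_form = [A_prime * (x / sum(p)) * (r - pi_nat) / (1 + pi_nat)
               for x, r in zip(p, rates)]
```

Here the State whose rate exceeds the national rate gains and the other two lose, and the changes cancel in total.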

When  the  total  allocations,  open-ended  and  fixed-pie, 
are  approximately  equal,  say,  Ap  =  A'  =  AP,  the  changes  be- 
tween open-ended  and  fixed-pie  should  be  compared.  The 
following  quotation  will  guide  the  analysis  [7,  pp.  106- 
107]: 

We  have  compared  the  distribution  of  $1  billion  among 
the  States  on  the  basis  of  the  census  counts  and  the  dis- 
tribution of  $1  billion  among  the  States  on  the  basis  of 
three sets of corrected population figures (table VIII-A).
The  illustrative  comparison  shows  that  the  size  and  varia- 
tion of  the  percentage  shifts  in  the  funds  apportioned 
among  the  States  on  the  basis  of  the  corrected  population, 
as  compared  with  the  distribution  on  the  basis  of  the 
census  population,  are  much  smaller  than  the  size  and 
variation  of  the  rates  of  underenumeration  (that  is,  per- 
centage shifts  in  population  among  the  States)  used  to 
correct  the  population. 

In  general,  simple  apportionment  formulas  dampen  con- 
siderably the  effect  of  any  variable  adjustment  of  a  set 
of  data. 


The analysis places these remarks in a formal framework. Write the final expression in (30) in the form:

\[ A' \, \frac{p^*}{p} \cdot \frac{\pi^* - \pi}{1 + \pi} . \tag{31} \]

Here, \(p_i\) and \(\pi_i\) have been replaced by the random variables \(p^*\) and \(\pi^*\). Not using the subscript i indicates that one State is much like another from the viewpoint of this analysis of errors in the counts and in the modification process. The unsubscripted variables without *'s refer to the national value while * refers to a State.

The  following  plausible  assumptions  will  be  used: 


1. \(p^*\) and \(\pi^*\) are independent.

2. \(E(p^*/p) = 1/n < 0.1\).

3. \(E\pi^* = \pi\) (= 0.025), and \(\pi^*\) is approximately normal with variance \(\sigma^2\). Since \(\pi^*\) never exceeds 0.1, \(\sigma^2 < 0.01\), and 0.0001 appears as a typical value for \(\sigma^2\).

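Under these assumptions,

\[ E\, A' \frac{p^*}{p} \left| \frac{\pi^* - \pi}{1 + \pi} \right| = A' \, \frac{E p^*}{p} \, (.8) \, \frac{\sigma}{1 + \pi} < A' (.1)^2 . \tag{32} \]

Now, with the same assumptions, consider (26) in the form:

\[ A p^* \pi^* \tag{33} \]

so that

\[ E\, A p^* \pi^* = A (E p^*) \pi . \tag{34} \]

The ratio of the expected change with open ended to the expected absolute change with fixed pie is:

\[ \frac{A (E p^*) \pi}{A' \, \dfrac{E p^*}{p} \, (.8) \, \dfrac{\sigma}{1 + \pi}} . \tag{35} \]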


Assume \(A' = AP\), or equivalently, \(A' = Ap(1 + \pi)\). Thus, (35) becomes

\[ \frac{\pi}{(.8)\,\sigma} . \tag{36} \]

(In these computations, no distortion of practical results will occur due to the use of \(E\pi^*\) instead of \(E|\pi^*|\).)

The fraction in (36) is unbounded above, since \(\sigma = 0\) is a possible modification process. Values of 2 or 3 for (36) seem likely. The (36) ratio does give a quantitative expression for some of the above quoted remarks of SPRR. The minimum value of the fraction is

\[ \frac{1}{(.8)\sqrt{n - 1}} \tag{37} \]

when all the \(\pi_i\)'s = 0, with the exception that one of them equals \(n\pi\), and the probability that a particular State is selected is 1/n. This is an extreme case, but it indicates that the SPRR conclusion on size of shift is not universal. The condition \(A' = Ap(1 + \pi)\) results in equal total allocations by both processes.
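The ratio (36) and its extreme value (37) are easy to evaluate. The sketch below assumes that the factor .8 is the rounded value of sqrt(2/pi), the mean absolute deviation of a unit normal variable; the value n = 10 and the function names are chosen only for illustration.

```python
import math


def ratio_36(pi_nat, sigma, factor=0.8):
    """Ratio (36): pi / (.8 * sigma)."""
    return pi_nat / (factor * sigma)


def min_ratio_37(n, factor=0.8):
    """Extreme value (37): 1 / (.8 * sqrt(n - 1))."""
    return 1.0 / (factor * math.sqrt(n - 1))


# E|Z| for a standard normal Z; assumed to be the source of ".8".
half_normal_factor = math.sqrt(2 / math.pi)

# Typical case from the text: pi = 0.025 and sigma^2 = 0.0001.
typical = ratio_36(0.025, math.sqrt(0.0001))

# Extreme case: one of n States has rate n * pi, the rest have 0,
# and each State is selected with probability 1/n.
n, pi_nat = 10, 0.025
rates = [n * pi_nat] + [0.0] * (n - 1)
mean_rate = sum(rates) / n
sigma_extreme = math.sqrt(sum((r - mean_rate) ** 2 for r in rates) / n)
extreme = ratio_36(pi_nat, sigma_extreme)
```

The typical case lands near 3, in line with the "2 or 3" mentioned above, while the extreme configuration reproduces (37).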

Next, we compare the variances of the shifts in allocations. From (26), for the open-ended process, the relevant random variable is \(Ap^*\pi^*\), and from (30), for the fixed-pie process, the relevant random variable is \(Ap^*(\pi^* - \pi)\). We first compute

\[ \frac{1}{A^2} \left\{ V(Ap^*\pi^*) - V[Ap^*(\pi^* - \pi)] \right\} = E(p^*\pi^* - Ep^*\pi^*)^2 - E[(p^*\pi^* - Ep^*\pi^*) - (p^*\pi - \pi Ep^*)]^2 \]

\[ = 2\pi E(p^*\pi^* - Ep^*\pi^*)(p^* - Ep^*) - \pi^2 V(p^*) . \tag{38} \]

If \(p^*\) and \(\pi^*\) are independent, the first term equals \(2\pi^2 V(p^*)\), so this reduces to \(\pi^2 V(p^*)\). In this case,

\[ \frac{V(Ap^*\pi^*) - V[Ap^*(\pi^* - \pi)]}{[E(Ap^*\pi^*)]^2} = \frac{V(p^*)}{(Ep^*)^2} . \]

Further, the difference (38) could be negative when the correlation between \(p^*\pi^*\) and \(p^*\) is small positive or negative. This is in disagreement with SPRR; in that case the dampening effect of fixed pie is less than that of open ended.
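Identity (38) can be checked exactly on a small discrete example. The sketch below treats a handful of invented (population, omission rate) pairs as equally likely draws of (p*, pi*), sets A = 1, and uses hypothetical helper names; it is a check of the algebra, not a reproduction of any SPRR computation.

```python
from itertools import product


def mean(xs):
    return sum(xs) / len(xs)


def var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def variance_gap(pairs):
    """Return both sides of (38) for equally likely (p*, pi*) pairs."""
    p = [a for a, _ in pairs]
    pi_bar = mean([r for _, r in pairs])               # E pi*
    open_ended = [a * r for a, r in pairs]             # p* pi*
    fixed_pie = [a * (r - pi_bar) for a, r in pairs]   # p* (pi* - pi)
    lhs = var(open_ended) - var(fixed_pie)
    rhs = 2 * pi_bar * cov(open_ended, p) - pi_bar ** 2 * var(p)
    return lhs, rhs


sizes = [1.0, 2.0, 5.0]        # populations in millions (invented)
rates = [0.01, 0.025, 0.04]    # omission rates (invented)

# Independent case: every size is paired with every rate.
lhs_ind, rhs_ind = variance_gap(list(product(sizes, rates)))
expected_ind = mean(rates) ** 2 * var(sizes)           # pi^2 V(p*)

# Dependent case: the largest State has the smallest rate.
lhs_dep, rhs_dep = variance_gap(list(zip(sizes, reversed(rates))))
```

In the independent case the gap equals pi^2 V(p*), so fixed pie has the smaller variance; in the dependent case the gap is negative, which is the situation described in the last sentences above.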

In  thinking  of  the  political  interpretation  of  modification, 
it  is  natural  to  compare  States.  A  simple  measure  of  this  com- 
parison is 


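\[ \left( \frac{\text{change in allocation}}{\text{original allocation}} \right)_i - \left( \frac{\text{change in allocation}}{\text{original allocation}} \right)_j . \tag{39} \]

When multiplied by 100, this is the difference in percentage change in allocation between the States. If this formula is computed for the open-ended allocation, one obtains

\[ \frac{A\Delta_i}{A(w_i + b_i)} - \frac{A\Delta_j}{A(w_j + b_j)} = \frac{\Delta_i}{w_i + b_i} - \frac{\Delta_j}{w_j + b_j} = \pi_i - \pi_j . \tag{40} \]

And if (39) is computed for the fixed-pie allocation, one obtains

\[ \frac{A' \dfrac{p_i}{p} \left( \dfrac{\pi_i - \pi}{1 + \pi} \right)}{A' \dfrac{p_i}{p}} - \frac{A' \dfrac{p_j}{p} \left( \dfrac{\pi_j - \pi}{1 + \pi} \right)}{A' \dfrac{p_j}{p}} = \frac{\pi_i - \pi}{1 + \pi} - \frac{\pi_j - \pi}{1 + \pi} = \frac{\pi_i - \pi_j}{1 + \pi} . \tag{41} \]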


Thus, the comparisons between States of the percentage modification in allocations are practically the same for open-end and fixed-pie methods. If politicians look at the problem
in  terms  of  relative  advantage,  the  dampening  effect  from 
fixed  pie  versus  open  end  will  be  negligible. 
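The comparison in (40) and (41) can be illustrated by computing the relative changes directly from the two allocation rules. The populations, rates, and dollar amounts below are invented for the purpose.

```python
# Invented census populations p_i and State omission rates pi_i.
p = [1_000_000, 500_000, 200_000]
rates = [0.025, 0.024, 0.02]
P = [x * (1 + r) for x, r in zip(p, rates)]   # modified populations
pi_nat = sum(P) / sum(p) - 1                  # national rate pi

A = 10.0         # open-ended dollars per person (assumed)
A_prime = 1e9    # fixed-pie total (assumed)

# Relative change in allocation for each State under each rule.
rel_open = [(A * X - A * x) / (A * x) for x, X in zip(p, P)]
rel_fixed = [(A_prime * X / sum(P) - A_prime * x / sum(p))
             / (A_prime * x / sum(p)) for x, X in zip(p, P)]

# Measure (39) for States 1 and 2 under each rule.
gap_open = rel_open[0] - rel_open[1]      # (40): pi_1 - pi_2
gap_fixed = rel_fixed[0] - rel_fixed[1]   # (41): (pi_1 - pi_2) / (1 + pi)
```

The two gaps differ only by the factor 1/(1 + pi), about 2.4 percent of the gap in this example.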

The  real  advantage  of  fixed  pie  appears  to  stem  from  it 
sometimes  yielding  smaller  changes.  If  comparisons  between 
States  are  not  made  because  the  changes  appear  small,  then 
the  fixed-pie  allocation  will  cool  the  political  forces.  Actually, 
much  of  the  discussion  of  modification  of  allocation  centers 
on  equity.  It  is  not  entirely  clear  what  "equity"  means,  but 
most  usage  of  this  term  will  involve  the  kinds  of  comparisons 
suggested  above. 

The  strongest  apparent  argument  for  data  modification 
is  that  we  are  sure  the  census  data  contain  errors.  Large 
sums  of  money  are  allocated  with  the  use  of  census  data. 
Since  the  data  have  errors,  the  allocations  have  errors.  Hence 
we  should  remove  the  errors.  Some  comments  on  this  argu- 
ment are: 

1.  It  is  not  obvious  that  we  can  much  improve  on  the 
census  data  [7,  p.  2] . 

2.  If  the  counts  are  not  corrected,  those  communities  that 
have  done  their  civic  duties  well  are  at  an  advantage. 
Not  correcting  could  encourage  people  to  be  counted 
(twice!);  correcting  could  decrease  the  incentive  to  be 
counted.  The  Constitution  might  intend  to  use  this 
device  to  help  the  census,  see  comments  in  [2] . 

3.  It  is  likely  that  different  counts  will  be  used  for  differ- 
ent purposes,  such  as  allocations  and  apportionment. 
This  could  result  in  loss  of  confidence  in  the  Census 
Bureau  for  all  activities. 

4.  Any  corrections  made  would  not  be  perfect,  and  there 
is  no  consensus  on  which  distribution  of  errors  would 
be  most  equitable  [7,  p.  107] . 

METHODOLOGY:  GENERAL  COMMENTS 

The  separation  of  demography  and  statistics  is  well  recog- 
nized. A  conference  was  held  at  the  beginning  of  the  1970's 
at  the  East-West  Center  of  the  University  of  Hawaii  to  ex- 
plore the  reasons  for  this  lack  of  interaction.  There  is  evi- 
dence that  the  conference  had  little  effect.  SPRR  makes 
limited  use  of  statistical  reasoning.  Probabilistic  concepts 
are  not  used  either  to  describe  "errors"  or  to  assess  the 
quality  of  "estimates."  The  SPRR  analysis  has  two  specific 
aspects  that  are  particularly  puzzling  to  me;  I  am  not  sure 
whether  the  puzzle  arises  in  my  role  as  a  statistician  or  as  a 
general  observer.  To  proceed,  a  few  comments  will  be  made 
on  the  subjects  of  this  paragraph  and  then  more  specific 
comments  will  be  made  on  the  methodology. 

Begin  with  a  puzzle.  Since  the  1970  census  contains 
errors,  if  we  want  to  improve  on  the  census,  our  modifica- 
tions should  (not)  depend  on  the  counts  in  the  1970  census. 
SPRR [7, p. 3] insists on "should not"; in chapter VIII, words
like "correct" and "adjustment" mean "replace the census
count with an independent estimate." The SPRR "estimates"
are  formed  without  using  the  1970  counts.  SPRR  would 
prefer  no  use  of  census  data,  but  the  census  was  the  only 
useful  source  of  migration  data. 

I  think  every  effort  should  have  been  made  to  use  the 
1970 counts;⁸ otherwise one throws the baby out with the
wash.  When  one  has  a  measurement  process  that  contains 
random  errors  and  biases,  the  usual  practice  is  to  calibrate 
the  process  with  alternative  information;  only  in  extreme 
cases  do  you  throw  out  the  original  process.  (It  is  important 
to  recall  that  the  vast  bulk  of  SPRR  is  devoted  to  assessing 
the  errors  in  the  census;  SPRR  is  not  concerned  with  "ad- 
justments" of  the  census.) 

The  second  puzzle,  closely  related  to  the  first,  is:  There 
is  (not)  a  way  to  calibrate  the  current  (1970)  census  counts. 
SPRR  insists  on  "is  not."  (See  [7],  p.  3,  "Ultimate  untesta- 
bility";  p.  9,  right-hand  column,  "indirect";  p.  20,  "cannot 
be  answered";  p.  24,  "arbitrarily";  p.  26,  "evidence  cannot  be 
found";  p.  40,  "metaphysical  quality";  pp.  40  and  83,  "un- 
testable  assumptions";  p.  94,  "lacking  any  formal  empirical 
basis";  and  p.  104,  left-hand  column,  paragraph  2, "no-way.") 
Part  of  this  negativism  corresponds  to  judgmental  decisions 
that  occurred  in  a  very  complex  analysis.  The  general  result 
of  the  SPRR  analysis  is  that  the  authors  are  working  on  a 
problem  where  predictions-hypotheses  cannot  be  tested. 
They  have  a  battery  of  internal  or  indirect  checks  on  the 
quality  of  their  estimates  [7,  p.  9,  right-hand  column] . 
They  have  put  their  developmental  effort  into  estimation  of 
census  "error,"  but  they  did  not  address  the  problem  of  ob- 
taining a  tested  method  of  adjusting  the  census. 

SPRR  presumably  felt  their  analysis  must  precede  the  cali- 
bration problem.  They  might  feel  that  the  calibration  prob- 
lem is  meaningless  or  impossible.  If  one  is  completely  nega- 
tive about  the  regularity  of  (social)  nature,  there  can  be  no 
science.  (SPRR  is  not  extreme  in  its  negativism;  it  makes 
much  use  of  life  tables.) 

SPRR  has  considerable  statistical  strength.  Results  regard- 
ing samples  are  well  handled.  It  does  not,  however,  have  a 
unifying  concept  of  "error,"  as  in  subjective  statistics.  A 
frequent  SPRR  interpretation  of  "error"  is  a  discrepancy 
between  two  proposed  values  for  the  same  numerical  concept. 
An  analogue  for  the  statistical  distribution  of  errors  is  the 
collection  of  "errors"  arising  from  all  of  the  proposed  values 
for  the  same  concept.  Although  the  analogy  is  somewhat  ex- 
ploited (for  example,  ranges  of  "errors"  are  considered), 
it does not receive an overt development. It is my impression that this treatment of "error" will not generate a satisfactory methodology. One sees the authors of SPRR attempting to describe the intervals they have found for the omission rates (pp. 91-92). The vocabulary is informal but strives to be quantitative: "acceptable," "good," "adequate," "not a wholly adequate," "inadequate," "too broad," intervals.

⁸ No doubt this is a Catch-22 criticism of SPRR. If SPRR had some interest in giving methods to be used for census modification, then I insist on the argument. If SPRR were just developing techniques to explore the coverage rate, then I would think they are too late with too little.

Alone,  I  cannot  argue  the  following  position  in  detail  or 
with  great  success.  But  if  I  were  now  to  be  involved  in  a 
substantial  effort  to  modify  the  census,  my  basic  approach 
would  be  to  think  of  it  as  a  problem  in  statistical  calibration; 
I  would  try  to  use  the  current  census  data,  I  would  try  to 
have  a  broad  and  unified  concept  of  error,  and  I  would  strive 
for  testable  predictions. 

DEMOGRAPHIC  ANALYSIS 

In  demographic  analysis,  one  obtains  several  different 
methods  to  estimate  a  quantity.  If  the  estimates  differ  among 
themselves,  then  the  data  used  or  the  logic  behind  the  esti- 
mates are  not  perfect.  The  following  applies  to  those  situa- 
tions where  one  can  demonstrate  the  source  of  differences 
is  the  data  and  not  faulty  logic.  If  one  of  the  methods  is  a 
standard  or  is  known  to  yield  results  close  to  the  quantity 
being  estimated,  then  one  can  actually  pinpoint  which  data 
sets  are  in  error  and  the  sizes  of  the  errors. 

As  a  special  case  of  demographic  analysis,  consider  the 
number  of  white  people  under  age  35  on  April  1,  1970,  who 
should  have  been  counted  in  the  U.S.  census.  An  estimate  of 
this  quantity  would  be  one  of  the  published  census  results. 
Another  method  of  estimation  is  to  use  demographic  logic, 
that  is,  use  the  relevant  numbers  of  people  who  were  (1) 
born  in  the  United  States,  (2)  died  in  the  United  States, 
(3)  immigrated  to  the  United  States,  and  (4)  emigrated  from 
the  United  States.  These  figures  are  combined— (1)-  (2)+  (3) 
-  (4)— to  give  the  demographic  estimate  of  the  population 
size.  Of  course,  none  of  these  quantities,  (1),  (2),  (3),  (4), 
are  known  exactly.  From  other  sources,  it  is  known  that 
(1)  and  (2),  as  reported,  are  too  small,  and  hence  they  are 
"corrected."  The  properties  of  (3)  and  (4)  are  not  well 
known.  Nevertheless,  the  demographic  estimate  in  this  case 
is  considered  to  be  much  closer  to  the  true  population  size 
than the census count. Hence, one minus the ratio of the census count
to the demographic estimate is the omission rate for
the white population under 35 in 1970. This near ideal of
demographic  analysis  involves  a  substantial  amount  of  work. 
Other  applications  have  additional  complications,  and  it  is 
not  always  clear  where  the  errors  arise  nor  how  large  they 
are.  The  task  of  demographic  analysis  is  like  putting  a  jigsaw 
puzzle  together  when  the  pieces  have  been  worn  and  the 
picture  has  no  border.  From  the  final  puzzle,  the  size  of  the 
original  picture  will  not  be  clear. 
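The bookkeeping described above, (1) - (2) + (3) - (4) followed by the comparison with the census count, amounts to the following sketch. All component figures are invented round numbers, not 1970 data, and the function names are hypothetical.

```python
def demographic_estimate(births, deaths, immigrants, emigrants):
    """Population by 'conservation of humans': (1) - (2) + (3) - (4)."""
    return births - deaths + immigrants - emigrants


def omission_rate(census_count, demographic):
    """One minus the ratio of the census count to the demographic estimate."""
    return 1.0 - census_count / demographic


# Invented, already "corrected" components for a cohort under age 35.
estimate = demographic_estimate(
    births=100_000_000,
    deaths=5_000_000,
    immigrants=4_000_000,
    emigrants=1_000_000,
)
rate = omission_rate(census_count=96_040_000, demographic=estimate)
```

With these inputs the demographic estimate is 98 million and the implied omission rate is 2 percent; any error in the four components passes directly into the rate.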

In  the  above  example,  the  two  estimates  are  based  on  en- 
tirely different  data  and  most  of  the  error  is  ascribed  to  one 
data  source.  Insofar  as  two  estimates  use  the  same  data  (in 
the  same  manner),  one  does  not  obtain  a  simple  view  of  the 
error  in  the  common  data. 
In this simple example, the obvious⁹ estimate of the national count is the demographic estimate, which then suggests that for this purpose the census is not useful! When the demographic problem becomes more complex,¹⁰ it might be useful to use some census data (but not the population counts) in forming the estimates. For example, to estimate
the  number  of  white  persons  in  each  State  under  age  35,  it 
is  useful  to  have  information  on  between-State  migration; 
such  data  are  available  from  the  census  question  on  "State 
of birth." The base of demographic analysis is bookkeeping
or conservation of humans over time. Nonbookkeeping techniques
include the use of life tables or sex ratios obtained
from  one  population  applied  to  other  populations.  Census 
counts  are  used  in  some  parts  of  demographic  analysis;  for 
example, if the size of a cohort increases between censuses
(without  net  migration),  one  has  a  good  idea  of  the  location 
of  troubles.  Examples  of  this  kind  occur  for  the  black 
working-age  male. 

Demographic analysis does not emphasize (1) use of nondemographic
data, such as macroeconomic or physical data (number of cars, etc.)
or social data (number of imputations);
(2)  use  of  relationships  other  than  conservation  (regression 
analysis  is  excluded);  and  (3)  predictions  and  hypotheses 
that  can  be  tested.  The  artificial  exclusion  of  data  and  stan- 
dard techniques  discourages  confidence  in  demographic 
analysis,  either  for  the  purpose  of  locating  errors  or  modify- 
ing census  counts.  The  lack  of  testable  predictions  takes  one 
out  of  the  realm  of  science.  (The  demographic  analysis  has 
built-in  attitudes  that  require  reasonable  results,  such  as  ad- 
joining States  should  have  similar  results  and  derived  sex 
ratios  should  be  consistent  with  past  experience.  Again, 
see  [7] ,  p.  9,  right-hand  column.) 

MATCHING 

If  the  same  population  segment  is  examined  several  times 
(census,  birth  certificates,  social  security,  etc.),  and  if  the 
records  for  individuals  are  matched  where  possible,  then  the 
numbers  of  nonmatched  records  can  be  used  to  estimate 
omission  rates.  Although  this  procedure  is  occasionally  used, 
its cost¹¹ prohibits extensive use as would be required to
obtain  estimates  of  State  and  sub-State  omission  rates. 
Because  the  procedures  used  to  obtain  large  sets  of  data  are 
very  complex,  the  matching  method  cannot  be  applied  in  a 
simpleminded  manner.  For  example,  imputed  census  records 
cannot be matched to other records. Matching theory is
straightforward in the unusual situation where the causes
for  omission  are  statistically  independent  in  the  data  sets. 
The  criticisms  of  demographic  analysis  apply  to  matching. 
SPRR  uses  a  composite  method  where  results  from  demo- 
graphic analysis  and  matching  are  combined  by  taking 
averages. 
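For the independent case mentioned above, a standard form of matching theory is the dual-system (capture-recapture) estimator. The text does not say which formula it has in mind, so the sketch below is an assumption: it uses that standard estimator, with invented record counts and hypothetical function names.

```python
def dual_system_estimate(n_census, n_other, n_matched):
    """Estimated population size when omissions from the two lists are
    statistically independent: N = n_census * n_other / n_matched."""
    if n_matched <= 0:
        raise ValueError("at least one matched record is required")
    return n_census * n_other / n_matched


def census_omission_rate(n_other, n_matched):
    """Share of the second list's records not found in the census.

    Under independence this also estimates 1 - n_census / N.
    """
    return 1.0 - n_matched / n_other


# Invented counts for one small area.
n_census, n_other, n_matched = 9_500, 8_000, 7_600
n_hat = dual_system_estimate(n_census, n_other, n_matched)
rate = census_omission_rate(n_other, n_matched)
```

Imputed census records, which cannot be matched, would have to be removed from `n_census` before such a calculation; the sketch ignores that complication.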

SYNTHETIC  OR  STATISTICAL  METHOD 

Once  useful  estimates  have  been  obtained  for  omission 
rates  at  some  geographic  level,  say,  States,  one  could  apply 
those rates to smaller geographic areas. For example, if State omission rates by race, \(\omega_i\) and \(\beta_i\), are available, then one could compute

\[ P_{ij} = w_{ij} (1 + \omega_i) + b_{ij} (1 + \beta_i) \]

where \(P_{ij}\) is the estimated population and \((w_{ij}, b_{ij})\) are the census racial counts for the jth region in State i. This method is an obvious exploitation of the demographic results \((\omega_i, \beta_i)\). The exploitation has serious appeal: (1) It makes direct use of census counts \((w_{ij}, b_{ij})\). (2) The method makes many predictions; much use is made of the \((\omega_i, \beta_i)\). (3) Some of the predictions can be checked, since the \(P_{ij}\) are relatively small compared to the \(P_i\).¹² It is not uncommon to have recounts. In 1977, there were 258 special censuses (see [8]).¹³ Also,
careful  matching  of  records  for  relatively  small  popula- 
tions can  be  used.  These  checks  are  expensive  and  im- 
perfect. But  it  is  essential  to  have  some  empirical  base  to 
check  the  (demographic)  analysis.  At  least  there  is  an  indica- 
tion of  a  possible  empirical  verification  of  the  analysis.  Other 
methods  of  verification  should  be  sought. 
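The synthetic computation is simple enough to sketch directly. The State rates and regional counts below are invented, and `synthetic_estimate` is a hypothetical name for the formula for the jth region of State i.

```python
def synthetic_estimate(w_ij, b_ij, omega_i, beta_i):
    """P_ij = w_ij (1 + omega_i) + b_ij (1 + beta_i)."""
    return w_ij * (1 + omega_i) + b_ij * (1 + beta_i)


# Invented State-level omission rates and regional census counts (w, b).
omega_i, beta_i = 0.02, 0.08
regions = [(50_000, 10_000), (30_000, 2_000), (20_000, 8_000)]

estimates = [synthetic_estimate(w, b, omega_i, beta_i) for w, b in regions]

# Implied regional omission rates differ only through the racial mix.
implied_rates = [est / (w + b) - 1 for est, (w, b) in zip(estimates, regions)]

# The regional estimates add up to the State-level modified count.
state_w = sum(w for w, _ in regions)
state_b = sum(b for _, b in regions)
state_estimate = synthetic_estimate(state_w, state_b, omega_i, beta_i)
```

Each regional figure is a separate prediction that a recount or special census could, in principle, confirm or contradict.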

If, as a result of these checks, the predictions appear inadequate, other models could be fitted. For example, [6] introduced \(\omega_L\), \(\omega_H\), \(\beta_L\), and \(\beta_H\), where \((\omega_L, \beta_L)\) are omission rates (national) for low-income people and \((\omega_H, \beta_H)\), the corresponding rates for people at other income levels. These rates must satisfy

\[ w (1 + \omega) = w_L (1 + \omega_L) + w_H (1 + \omega_H) \tag{42} \]

and

\[ b (1 + \beta) = b_L (1 + \beta_L) + b_H (1 + \beta_H) \tag{43} \]

where \((w, b)\) are the total counts of the races and (\(w_L = w - w_H\) and \(b_L = b - b_H\)) are the total numbers of poor of each race. Then estimate \(P_i\), using obvious notation, by

\[ P_i = w_{iL} (1 + \omega_L) + w_{iH} (1 + \omega_H) + b_{iL} (1 + \beta_L) + b_{iH} (1 + \beta_H) . \tag{44} \]

⁹ This would be the case if the only possibilities were the demographic analysis and the census count, but these are not the only choices.

¹⁰ The demographic analysis of the Hispanic population appears very complex; at least, it brings many new problems. (See [9].)

¹¹ A strategy that needs exploration is the expenditure of large sums, on the order of $100 million, to help modify census counts and prepare intercensal estimates. See Roberts' comment in [2] (1979).

¹² Some of the following material is anticipated by [6, p. 13] and [7, pp. 4-5].

¹³ The implicit suggestion being made here is that many of the activities of the special census could be used to help in the modification. Because of time of occurrence and procedures used, recounts and special censuses are not directly comparable to the decennial census. Nevertheless, with careful coordination and modest procedural changes, such activities as recounting and special censuses might play an important role in the analytical activities of the Bureau of the Census.

The State counts by race and economic level are available from the census. Siegel [6] was led to this model in an effort to explore the consequences of various assumptions; doing a sensitivity study, he "arbitrarily" chose \(\omega_L = 2\omega_H\) and \(\beta_L = 2\beta_H\). At the sub-State level we might need quantities such as \((\omega_{iL}, \omega_{iH}, \beta_{iL}, \beta_{iH})\). As a result of the checks on the simpler model \((\omega_i, \beta_i)\), there should be some empirical evidence to help estimate \(\omega_{iL}\) and \(\beta_{iL}\).
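Given the national rate for a race and the assumption that the low-income rate is twice the high-income rate, constraint (42) determines both rates: substituting \(\omega_L = 2\omega_H\) gives \(\omega_H = w\omega / (2w_L + w_H)\). The sketch below works this out for invented counts; the function name and the multiplier argument `k` are hypothetical.

```python
def split_rates(total, low, rate, k=2.0):
    """Solve total*(1 + rate) = low*(1 + k*r_high) + high*(1 + r_high).

    Returns (r_low, r_high) with r_low = k * r_high, where
    high = total - low is the count at other income levels.
    """
    high = total - low
    r_high = total * rate / (k * low + high)
    return k * r_high, r_high


# Invented national count for one race, its low-income part, and its rate.
w, w_L, omega = 1_000_000, 200_000, 0.024
omega_L, omega_H = split_rates(w, w_L, omega)

# Both sides of constraint (42).
lhs = w * (1 + omega)
rhs = w_L * (1 + omega_L) + (w - w_L) * (1 + omega_H)
```

With these inputs the rates come out at 4 percent and 2 percent, and (42) holds; the same function applies to (43) with the black counts and \(\beta\).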

Other models can be explored empirically, such as

\[ P_{ij} = w_{ij} (1 + \omega_i) + b_{ij} (1 + \beta_i) + \sum_k \gamma_{ik} (x_{ijk} - x_{i \cdot k}) \tag{45} \]

where \(x_{ijk}\) is the kth economic or physical or social variable associated with the jth region in the ith State. Such models are very general so that substantial consideration is required to select those models worthy of study. Notice, the \(x_{ijk}\) can be functions of population counts.

The above exploitations¹⁴ of the output of the demographic analysis appear to place that analysis as a sensible part of the estimation problem. There is an opportunity¹⁵ for empirical verification and a coherent statistical analysis. A result of such analysis could be to arrive at a simpler model, such as

\[ P_{ij} = w_{ij} (1 + \omega) + b_{ij} (1 + \beta) + \sum_k \gamma_k (x_{ijk} - x_{i \cdot k}) . \]

THE  NEED  FOR  SPRR 

At this point, we can ask why a substantial effort was made to find \((\omega_i, \beta_i)\) in preference to the available \((\omega, \beta)\). There are acceptable (to Census) values for \(\omega\) and \(\beta\), corrected for sex and broad age groups. Use of these corrections does not appreciably change the values of the \(\{P_i\}\), and the use of \(\omega_L\) and \(\beta_L\) (as above) was inconsequential. (See [6], p. 11.) Since SPRR was "developmental," the work might have been done as an exercise in sensitivity analysis. A possible motivation for the work was empirical evidence that there was substantial variation among the \(\{\omega_i\}\) or \(\{\beta_i\}\). Reference 6 summarizes the empirical evidence regarding differential undercount. The table in the right-hand column of p. 6 indicates there is a regional effect as well as a racial effect. Apparently, a set of State estimates has been prepared with this evidence. (See [4], p. 6.) An explicit argument from the empirical evidence to the plausible underenumeration rates has not been made.

Let the populations generated for the States by application
of the national values of \((\omega, \beta)\) be called basic. The
sensitivity  studies  suggested  by  this  evidence  include  the 
following  results: 

1. Inclusion of information on age and sex does not cause
substantial  change  from  basic. 

2.  Assuming  the  omission  rate  for  low  income  is  twice 
that  for  high  income  does  not  cause  a  substantial 
change  from  basic. 

3. Making omission rates proportional to income (the precise procedure not being explained) yields more variability and more large values for the State omission rates than basic.

4. With "education" replacing "income," results are the same as in 3 above.

Although  the  sensitivity  study  shows  plausible  results, 
the  evidence  is  not  convincing  that  any  of  the  used  assump- 
tions are  better  than  basic.  (The  alternative  computations 
were  done  primarily  to  illustrate  the  consequences  of  varia- 
tions in  omission  rates  for  population  segments,  but  pre- 
sumably the  cases  were  picked  because  of  their  plausibility.) 

Basic is not appealing to some because it does not create very much variability between the States: General impressions, continued interest, and sometimes heated discussion suggest there should be big differences. Thus, part of the motivation to search for \((\omega_i, \beta_i)\) in preference to \((\omega, \beta)\) is to find an anticipated substantial variation in the \(\pi_i\).

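Let \(\pi^*\) be generated from the basic synthetic method, that is,

\[ \pi^* = \frac{\beta b^* + \omega w^*}{b^* + w^*} . \tag{46} \]

Let \(\pi^{**}\) be generated by a synthetic method, where the \(\beta_i\) and \(\omega_i\) need not be constant. So

\[ \pi^{**} = \frac{\beta^* b^* + \omega^* w^*}{b^* + w^*} . \tag{47} \]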


¹⁴ The demographic analysis supplies a part (possibly major) of the total modification.

¹⁵ The methodology requires the data for verification. Demographic analysis can get along without it as far as it can go. Having the need for data to apply the statistical methods can help obtain the data. Unless such data are created, the results of every modification procedure cannot be assessed. If such data are not to be obtained, one must consider abandoning the idea of modifying census data.


Assume, as usual,

$$E\pi^{**} = E\pi^* = \pi = \frac{\omega w + \beta b}{w + b} \qquad (48)$$


and the $(b^*, w^*)$ is the same random variable in (46) and
(47). Now compute

$$\begin{aligned}
V(\pi^{**}) &= E(\pi^{**} - \pi^* + \pi^* - \pi)^2 \\
&= V(\pi^*) + 2C(\pi^* - \pi,\ \pi^{**} - \pi^*) + V(\pi^{**} - \pi^*) \qquad (49)
\end{aligned}$$
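
The decomposition in (49) can be checked numerically. The sketch below is a hypothetical Monte Carlo setup, not taken from the paper: State counts are drawn uniformly, and the additional modification is zero-mean Gaussian noise added to the national rates, independent of the counts. The noise scales and count ranges are assumptions chosen only for illustration.

```python
import random
from statistics import fmean

random.seed(0)
BETA, OMEGA = 0.077, 0.019  # national rates assumed for 1970 in appendix 1

def draw():
    # Hypothetical State: random counts (b*, w*) plus independent zero-mean noise.
    b = random.uniform(10, 500)
    w = random.uniform(500, 5000)
    delta = random.gauss(0, 0.02)   # perturbation of beta (assumed scale)
    eps = random.gauss(0, 0.005)    # perturbation of omega (assumed scale)
    pi_star = (BETA * b + OMEGA * w) / (b + w)
    pi_2star = ((BETA + delta) * b + (OMEGA + eps) * w) / (b + w)
    return pi_star, pi_2star

draws = [draw() for _ in range(100_000)]
pi = fmean(p for p, _ in draws)          # common mean, estimated from the basic draws

def second_moment(xs, ys):
    return fmean(x * y for x, y in zip(xs, ys))

basic = [p - pi for p, _ in draws]       # pi* - pi
extra = [q - p for p, q in draws]        # pi** - pi*
total = [q - pi for _, q in draws]       # pi** - pi

v_total = second_moment(total, total)
v_basic = second_moment(basic, basic)
cross = second_moment(basic, extra)
v_extra = second_moment(extra, extra)
```

Because `total` is the elementwise sum of `basic` and `extra`, `v_total` equals `v_basic + 2 * cross + v_extra` up to rounding. With noise independent of the counts, `cross` is close to zero and `v_total` exceeds `v_basic`.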

A sufficient condition for $\pi^{**}$ to have larger variance than
$\pi^*$ is to have $C(\pi^* - \pi,\ \pi^{**} - \pi^*) \ge 0$ (and $\pi^* \ne \pi^{**}$). This
will happen when $\pi^*$ (the basic modification) and $\pi^{**} - \pi^*$
(the additional modification) are uncorrelated; if the additional
modification is "noise," this will be the case. For example,
assume

$$\beta^* = \beta + \delta^*$$

and

$$\omega^* = \omega + \epsilon^*$$


where $E\delta^* = E\epsilon^* = 0$ and $(\delta^*, \epsilon^*)$ is independent of $(b^*, w^*)$.
Then

$$C(\pi^*,\ \pi^{**} - \pi^*) = E\left[\left(\frac{\beta b^* + \omega w^*}{b^* + w^*}\right)\left(\frac{\delta^* b^* + \epsilon^* w^*}{b^* + w^*}\right)\right] = 0 \qquad (50)$$

Thus, any modification beyond basic, which is independent
of $(b^*, w^*)$, will increase the variance, that is $V(\pi^{**}) > V(\pi^*)$.
In fact, for this model,

$$V(\pi^{**}) = V(\pi^*) + E\left(\frac{\delta^* b^* + \epsilon^* w^*}{b^* + w^*}\right)^2 \qquad (51)$$

In summary, one would anticipate $V(\pi^{**}) > V(\pi^*)$ in many
situations; the occurrence of the inequality is not suggestive
that the modification was particularly worthy.

It also should be noticed that if $E\pi^{**} = E\pi^* = \pi$ and
$V(\pi^{**}) > V(\pi^*)$, then $\pi^{**}$ will tend to have more large
observations than $\pi^*$ (assume that $P(\pi^{**} < 0) = P(\pi^* < 0) = 0$).
As an example, assume $X = X(m_1, m_2)$ is a Beta random
variable with first two moments $m_1, m_2$. Then the density
satisfies

$$f(x) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\, x^{a-1} (1-x)^{b-1} \qquad (52)$$

with

$$m_1 = \frac{a}{a+b} = \frac{1}{1 + b/a} \qquad (53)$$

and

$$m_2 = \frac{a(a+1)}{(a+b+1)(a+b)} = \frac{m_1}{1 + \dfrac{b}{a+1}} \qquad (54)$$

Thus, for fixed $m_1$ and increasing $m_2$, the ratio $b/a$ must
be fixed and the ratio $b/(a+1)$ must decline. That is, $b$
must decline, which forces more probability into the right
tail. Consider the case $m_1 = 0.025$, which implies $b = 39a$.
For $a > 1$, the density has one mode, $(a-1)/(a+b-2)$. For
$a < 1$, a mode appears at the right.

Thus, preparation of $(\beta_i, \omega_i)$ instead of the use of $(\beta, \omega)$
to generate $(\pi_i)$ will tend to increase the variability and the
number of large omission rates for the States. This result will
occur whether the job is done well or poorly. This increase
in variability agrees with common desire, but it also needs a
strong empirical base.

The SPRR demographic analysis does not develop for
individual States specific omission rates that are strongly
favored as being correct or better than the synthetic estimates.
The analysis works with estimates arising from a variety
of plausible assumptions. As a collection, these estimates
appeal to SPRR because:

1.  The variability between States, with the estimates proposed
by SPRR, is substantially larger than the variability
from basic synthetic assumptions.

2.  In spite of 1 (above), the SPRR series does not generate
any outrageously large omission rates. (Actually,
the SPRR series appears to me to produce some
excessively small omission rates, including a few (<3)
negative rates.)

3.  The range of omission rates for a State generated by
the different assumptions is modest.

4.  The correlation between omission rates for the States
for some pairs of sets of assumptions is high (≥0.9).
(The SPRR and synthetic omission rates have a very
low correlation (≤0.2).)

5.  The geographical pattern of SPRR omission rates is
consistent with other data.

Although the SPRR State omission rates do not individually
have strong appeal, they do seem appropriate,
along with the synthetic estimates, as useful quantities to
help in the estimation of subnational, possibly sub-State,
population counts.

TECHNICAL CRITICAL REMARKS

The following remarks on SPRR methodology will help
indicate why their estimates of omission rates are not individually
favored. This analysis will concentrate on the
white population under 35 years of age; this is the population
segment where the SPRR techniques work best.

Insufficient Reason


"Since  empirical  evidence  cannot  be  found  to  support  a 
particular  weighting  scheme,  the  most  practical  and  techni- 
cally defensible  approach  is  to  use  a  scheme  in  which  the 
two limiting values are given equal weight" (SPRR, p. 26).
This  is  certainly  "practical"  in  the  sense  of  easy,  but  there 
are  even  easier  things.  The  "technically  defensible"  aspect  is 
obscure. If one of the "limiting values" is too extreme, the
average  is  a  very  poor  guide.  Further,  "inasmuch  as  the  biases 
went  in  opposing  directions  and  their  relative  magnitudes 
could  not  be  determined,  the  two  sets  of  State-of-birth 
estimates  were  averaged  with  equal  weights...  "  (p.  26). 
SPRR  presents  these  remarks  in  a  harsh  and  unthinking 
manner,  but  the  reader  suspects  that  much  thought  and 
discussion  went  into  these  decisions.  ".  .  .  in  the  absence  of 
evidence  regarding  the  State-of-birth  distribution  of  the 
nonresponses,  they  were  assigned  according  to  two  limiting 
assumptions...  "  (p.  26).  In  this  case,  SPRR  has  clearly 
given  much  attention  to  the  problem;  see  "Data  Quality," 
below. 

Complexity 

This  essay  is  not  a  review  of  SPRR;  I  have  not  indi- 
cated what  SPRR  has  done.  The  analysis  involves  many 
definitions,  variables,  and  decisions.  The  authors  are  in 
a  "developmental"  stage  rather  than  production.  They 
hint  at  each  path  they  started  to  follow.  The  steps  are  com- 
plex, so  that  one  is  not  encouraged  to  derive  statistical  pro- 
perties of  the  end  product  from  assumptions  about  the  begin- 
ning. Reported  high  correlation  between  two  series  of  State 
omission  rates  might  be  evidence  that  one  of  the  series  is 
constructed  to  be  a  near  linear  function  of  the  other  or  that 
presumed  statistically  independent  sources  of  information 
are  not  independent.  In  SPRR,  it  is  not  easy  to  see  which, 
if  either,  is  correct. 

Data  Quality 

Apparently  a  major  source  of  variation  between  esti- 
mated State  omission  rates  is  the  poor  quality  of  the 
1970  census  data  on  the  question  regarding  State  of 
birth.  For  whites  under  35  years  old,  about  4  percent  of  the 
data are missing; a substantial amount of the obtained responses
must be wrong (table II-A). SPRR does much exploratory
work that indicates the missing data are not a
serious  problem.  The  response  errors  arise  from  a  variety 
of  reasons.  The  correct  response  is  the  State  of  residence  of 
the  mother  when  the  child  is  born.  Apparently  people  often 
give  the  location  of  the  hospital  where  the  birth  occurred 
or the current State of residence. SPRR does two basic
analyses: (1) Treats the census response as where birth
occurred (table II-A, col. 3). (2) Treats the census response
as residence at time of birth (table II-A, col. 1). The mean
absolute difference between the generated omission rates is
about  1  percent  and  the  average  omission  rate  is  about  2  per- 
cent. (It  should  be  noted  that  the  1  percent  does  not  include 
the  District  of  Columbia,  78  percent;  Maryland,  13  percent; 
and  Virginia,  6  percent.)  SPRR  takes  the  average  of  these 


two  columns  as  the  basic  estimate  of  omission  rates.  Not 
only  is  the  use  of  "insufficient  reason"  discomforting,  but 
we  are  in  the  position  of  accepting  the  average  of  two  bad 
observations  as  being  an  improvement.  We  are  not  sure  the 
truth  lies  between  the  limits.  The  two  rates  for  Texas,  4.5 
and  4.3  percent,  are  close  together,  and  they  seem  too  high. 
All  of  this  is  to  say  that  a  major  source  of  variation  in 
the  SPRR  omission  rates  is  bad  data,  which  perhaps  should 
not  have  been  used.  Of  course,  not  to  use  the  data  on  State 
of  birth  is  a  great  loss  because  they  are  the  migration  data. 
Without  them,  demographic  analysis  apparently  cannot 
begin;  perhaps  they  will   be  better  in  the    1980  census.16 

APPENDIX  1 
STATISTICAL  MODELS 

Three  stochastic  models  of  the  counting  and  modification 
process  are  given.  These  illustrate  the  possibility  of  con- 
sistency and  some  types  of  assumptions  that  might  be  of  use 
in  practice. 

Model 1. As a part of a Monte Carlo study, one could
generate a stochastic model after the data are collected and
the rates, $\beta_i$ and $\omega_i$, are determined by some method. In
particular, the Monte Carlo model could use

$$P(\beta_j^* = \beta_k) = \frac{b_k}{\sum_{i=1}^{n} b_i}$$

and

$$P(\omega_j^* = \omega_k) = \frac{w_k}{\sum_{i=1}^{n} w_i}$$

for each $j$ and $k$ from 1 to $n$. This would not be the complete
model, but it is enough for some analysis. Thus, from
(11)

$$E\beta_j^* = \frac{\sum_{i=1}^{n} \beta_i b_i}{\sum_{i=1}^{n} b_i} = \beta$$

and

$$E\omega_j^* = \omega$$

For the 1970 census, it is assumed that $\beta = 0.077$ and
$\omega = 0.019$.
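
The consistency property of model 1 can be illustrated with a small simulation. This is only a sketch: the five State counts and State-level rates below are hypothetical, and only the black rate is drawn. The draw uses `random.choices` so that each State's rate is selected with probability proportional to its count.

```python
import random

random.seed(1)

# Hypothetical State-level black counts b_i and omission rates beta_i.
b = [120, 40, 300, 15, 80]
beta_i = [0.09, 0.05, 0.08, 0.03, 0.07]

# Consistency target: the count-weighted national rate.
beta_national = sum(r * c for r, c in zip(beta_i, b)) / sum(b)

# Model 1: draw beta* = beta_k with probability b_k / sum(b).
draws = random.choices(beta_i, weights=b, k=200_000)
beta_mc = sum(draws) / len(draws)
```

The Monte Carlo mean `beta_mc` should settle near `beta_national`, which is the sense in which the randomly assigned rates remain consistent with the national rate.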


16 The  lack  of  good  internal  migration  data  is  thought  to  be  a 
real  shortcoming  in  the  U.S.  data  base.  If  arguments  were  made 
to  demonstrate  the  usefulness  of  such  data,  perhaps  its  collection 
could  be  justified. 




Model 2. In model 1, the omission rates were assigned at
random with probability proportional to size of the racial
count in the State. Another framework for a Monte Carlo
experiment is to introduce $(b^*, w^*)$ as the counts of the
races for a randomly selected State.17 The assumed device
for generating the random pair $(b^*, w^*)$ is

$$P(b^* = b_i \text{ and } w^* = w_i) = \frac{p_i}{p} \quad \text{for } i = 1 \text{ to } n$$

That is, the probability of selecting the State $i$ data is proportional
to the count of State $i$. Also, $(\beta_i, \omega_i)$ have been computed
by some method for each $i$. Define $(\beta^*, \omega^*)$ as the
omission rates associated with $(b^*, w^*)$. Now assume that
$(b^*, w^*)$ and $(\beta^*, \omega^*)$ are independent; size and omission
rate are independent. Further, assume $E\beta^* = \beta$ and $E\omega^* = \omega$;
that is, the model satisfies a consistency condition like that
in model 1. Then, with
the following consistency result is obtained

APPENDIX 2
RANDOM  BIAS 

A consequence of assumptions (1) and (3), below (31)
in section 3, is

$$E\frac{P^*}{P} = E\frac{p^*}{p}$$

The expected proportion of the count in a State is the same
before and after modification. The relationship is derived
directly:

$$E\frac{P^*}{P} = E\frac{p^*(1+\pi^*)}{p(1+\pi)} = E\frac{p^*}{p}\,E\frac{1+\pi^*}{1+\pi} = E\frac{p^*}{p}$$

Of  course,  the  modification  process  will  change  the  popula- 
tion of  each  State.  At  the  model  level,  modification  designed 
to  remove  bias  does  not  make  a  change  in  the  expected 
proportion  of  the  population  in  a  State.  At  the  practical 
level,  even  with  this  model,  modification  might  be  desirable. 
If  the  process  is  effective,  modification  will  give  allocations 
closer  to  congressional  intent  than  would  be  obtained  with 
the  original  counts.  (Here,  we  are  thinking  of  typical  alloca- 
tion methods  that  depend  on  population  proportions.)  Con- 
sequently, the  process  of  modification  increases  equity.  In 
this  model,  the  bias  is  random,  as  the  effects  are  random  in 
a  random-effects  model. 

Comments


William  G.  Madow 

Consultant 


Estimates  of  undercoverage  of  the  national  population 
have  been  made  using  demographic  analysis  for  the  1950, 
1960,  and  1970  censuses  of  population. 

The  need  for  estimates  of  population  of  subnational 
areas  in  the  United  States  adjusted  for  undercoverage  has 
grown  considerably  since  the  1970  census  because  of  the 
many  Federal  programs  allocating  funds  on  the  basis  of  pop- 
ulation and  other  census  data. 

In  1977,  the  Bureau  of  the  Census  issued  a  report,  re- 
ferred to  as  SPRR  in  Savage's  paper  and  in  these  comments, 
which  presented  "developmental"  estimates,  also  made  by 
demographic  analysis,  of  undercoverage  by  States. 

Savage's  paper  consists  primarily  of  a  critical  review  of 
making  estimates  by  demographic  analysis,  as  in  SPRR, 
especially  when  the  objective  is  to  modify  undercounts, 
and  a  discussion  of  why  he  believes  it  preferable  to  use  sta- 
tistical methods  for  these  purposes,  including  Bayesian 
methods.  My  basic  comment  on  Savage's  paper  is  that, 
although  his  discussion  of  SPRR  includes  a  discussion  of  the 
feasibility  of  making  adequate  subnational  estimates  by 
demographic  analysis,  it  does  not  include  a  discussion  of  the 
feasibility  of  making  such  estimates  by  statistical  methods. 
This  comment  is  not  necessarily  a  criticism.  I  believe  Savage 
wishes  both  to  avoid  having  demographic  analysis  accepted 
prematurely  as  the  preferred  method  for  making  subnational 
estimates  merely  because  such  estimates  were  adopted  at 
the  national  level  for  the  censuses  of  1950,  1960,  and  1970, 
and  to  persuade  the  decisionmakers  to  make  a  thorough  and 
expensive  effort  at  developing  estimates  by  statistical 
methods  intended  to  calibrate  the  census  counts  rather  than 
replace  them  as  in  demographic  analysis. 

The  conference  indicates  that  the  Bureau  of  the  Census 
is  giving  careful  consideration  to  both  demographic  and 
statistical  methods  of  making  subnational  adjusted  estima- 
tions of  the  1980  census.  SPRR  does  seem  to  be  a  greater 
effort  in  the  development  of  the  demographic  analysis  than 
has    at  least  so  far  been  visible  for  statistical   approaches. 

Let  us  turn  to  some  more  specific  comments  on  Savage's 
paper.  Demographic  analysis  uses  data  from  outside  the 
current  census  in  an  attempt  to  approximate  error-free 
census  values.  Different  data  are  used  for  approximations  to 
different  parts  of  the  tables  of  population  by  age,  sex,  and 
race.  When  an  algebraic  expression  exists,  e.g., 

Population = Births - Deaths + Immigration - Emigration

for a reasonable past time period, then accuracy of the population
estimate is achieved if each of the four terms on the
right  is  accurate.  However,  when  one  or  more  of  the  terms  on 
the  right  becomes  inaccurate,  e.g.,  immigration,  then  the 
estimate  becomes  statistical  in  the  sense  that  the  population 
estimate  may  be  subject  to  errors  depending  on  the  errors 
of  the  component  variables.  One  method  of  examining  the 
error  structure  of  any  random  variable  that  is  a  function  of 
other  (component)  random  variables  is  to  determine  the 
variability  of  the  function  induced  by  assigning  sets  of  possi- 
ble values  to  the  component  variables.  This  is  done  in  SPRR. 
One  or  more  probability  distributions  of  these  component 
random  variables  may  be  assumed  based  on  knowledge, 
judgment, experience, or anything else. Whether or not those
doing  demographic  analysis  make  this  step,  it  is  available  to 
them.  Thus  I  see  no  reason  to  contrast  demographic  and  sta- 
tistical analysis;  it  is  only  that  one  of  these  methods  (demo- 
graphic analysis)  is  sometimes  in  the  position  where  what 
it  is  desired  to  estimate  can,  with  sufficient  insight,  be  ex- 
pressed in  terms  of  essentially  error-free  component  vari- 
ables, i.e.,  the  estimate  is  made  by  deterministic  methods. 
(Clearly,  I  mean  these  statements  to  be  taken  relatively,  not 
absolutely.)  For  this  reason,  Savage's  assertion  "Estimates 
of  population  sizes  from  demographic  analysis  would  be  hard 
to  accept"  is  much  too  strong. 
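
The idea of assigning sets of possible values to the component variables and observing the induced variability can be sketched as follows. All figures are hypothetical (they are not SPRR data), and the function name is introduced only for illustration; immigration is given the widest range to mimic its being the least certain component.

```python
from itertools import product

# Hypothetical component scenarios (thousands of persons); not data from SPRR.
scenarios = {
    "births": [4000, 4050],
    "deaths": [1900, 1950],
    "immigration": [300, 400, 600],   # least certain component: widest range
    "emigration": [100, 150],
}

def population_change(births, deaths, immigration, emigration):
    """Demographic balancing equation for one period."""
    return births - deaths + immigration - emigration

# Evaluate the equation for every combination of component values.
estimates = [population_change(*combo) for combo in product(*scenarios.values())]
low, high = min(estimates), max(estimates)
```

The spread between `low` and `high` shows how much of the uncertainty in the total is induced by the assumed ranges of the components, here mostly by immigration.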

As  SPRR  shows,  in  dealing  with  subnational  estimates,  the 
component  variables  are  unlikely  to  be  error  free  and  thus  a 
major reason for preferring demographic analysis is lost. The
sizes  of  error  must  be  taken  into  account.  Also,  in  1980, 
immigration  would  be  more  in  error  than  in  earlier  censuses. 

Savage  feels  that,  for  problems  of  undercount,  demo- 
graphic analysis  has  weaknesses  because  it  does  not  permit 
the  testing  of  hypotheses  and  yields  estimates  of  under- 
count using  data  from  outside  the  population  census  itself. 
These  are  not  serious  problems  for  estimating  the  undercount 
and  adjusting  for  it  if  the  outside  data  are  accurate.  The 
undercount  results  in  biases  in  important  totals  rather  than 
increased  variances;  even  if  at  some  point  (e.g.,  in  synthetic 
estimates)  those  not  counted  are  treated  as  though  they 
are  "missing  at  random,"  higher  level  or  marginal  totals 
would  still  be  biased,  if  not  adjusted,  perhaps  to  totals  ob- 
tained by  demographic  analysis.  To  test  hypotheses  that 
biases  are  equal  implies  that  unbiased  or  very  good  estimates 
of  the  biases  are  available.  This  places  requirements  on  PES 
and  matching  studies  that  are  unlikely  to  be  satisfied.  Thus 
data  may  not  be  available  for  a  satisfactory  statistical  test  of 
hypotheses  by  statistical  analysis.  It  is  not  at  all  unusual  to 
use    "benchmarks"   obtained    outside   a  survey  to   improve 
estimates of totals. What matters, if one is using biases and
variances  as  indicators  of  error,  is  which  alternative  estima- 
tion procedure  produces  the  smaller  bias  and  variance.  If 
good  enough  PES  and  matching  studies  could  be  made, 
then  good  estimates  of  bias  or  relative  bias  could  be  obtained 
for  smaller  areas  than  those  for  which  demographic  analysis 
may  be  expected  to  provide  good  estimates  of  bias.  Real- 
istically, achieving  adequate  quality  of  PES  and  matching 
studies  for  good  estimates  of  bias,  especially  for  smaller 
areas, seems unlikely.


Thus,  while  I  may  share  Savage's  views  of  what  statistical 
models  might  accomplish  under  more  ideal  conditions  than 
those  we  now  face,  it  seems  unrealistic  to  make  choices  on 
the  bases  of  the  images  of  what  statistical  methods  might 
accomplish  rather  than  intensive  studies  of  the  alternative 
approaches.  I  do  want  to  express  my  agreement,  however, 
with  Savage  in  his  not  explicitly  expressed  belief  that  for 
subnational  estimates,  until  more  evidence  is  in,  demographic 
analysis  should  not  have  a  preferred  position;  it  is  one  of 
several  alternatives  that  must  be  studied. 


Floor Discussion


The  Census  Bureau  provided  some  background  to  the  dis- 
cussion by  relating  the  imputation  of  persons  in  the  1970 
census  to  the  undercount  issue.  It  was  noted  that  approxi- 
mately 5  million  persons  were  added  to  what  might  be  called 
the  direct  interview  population— those  who  were  counted  on 
the  basis  of  household  responses.  The  information  and  basis 
by  which  several  million  people  were  added  arose  from 
census  operations— a  check  on  units  designated  by  enumera- 
tors as  being  vacant  and  a  check  by  the  post  office  of  housing 
units  listed  in  certain  areas  of  the  South.  A  few  million  more 
persons  were  added  for  mechanical  reasons.  It  was  estab- 
lished from  field  counts,  for  example,  that  a  given  enumer- 
ation district  had  so  many  people.  In  tabulations  in  the 
processing  center,  far  fewer  persons  were  recorded.  On  the 
basis  of  the  information,  it  could  be  inferred  that  records 
were  lost,  some  people  were  lost,  and  these  people  were 
"made  up."  Persons  imputed  in  this  way  were  not  part  of 
the  undercount  estimated  for  1970. 

More  details  concerning  the  5  million  persons  that  were 
added  to  the  census  counts  were  requested  by  the  con- 
ference participants.  There  was  interest  in  whether  the 
Census  Bureau  might  use  information  on  unit  conversions 
to  impute  housing  units  in  some  urban  areas,  or  how  the 
Census  Bureau  determined  what  numbers  of  people  to 
allocate,  for  example.  It  was  also  suggested  that  such  addi- 
tions were  a  precedent  for  the  Census  Bureau  adjusting  the 
counts.  The  Census  Bureau  indicated  that  there  were  about 
5  million  persons  in  the  counts  for  whom  one  would  be  un- 
able to  obtain  further  information.  The  two  programs  that 
imputed  with  the  least  evidence  were  the  national  vacancy 
check  and  the  postenumeration  post  office  check.  About 
1  million  persons  were  imputed  as  a  result  of  the  national 
vacancy  check.  That  check  was  implemented  in  the  summer 
of  1970  on  a  national  basis  to  try  to  correct  for  a  problem 
that  was  noted— the  enumerators  were  apparently  classifying 
housing  units  as  vacant  that  were  occupied.  That  check 
yielded  about  1  million  persons,  based  on  a  sample  survey 
in  which  the  conversion  rates  were  identified  for  each  area 
and  the  computer  was  programmed  to  convert  some  vacant 
units— the  proportion  estimated  by  the  survey— to  occupied 
units  and  to  impute  the  characteristics  of  neighboring  house- 
holds. 
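
The vacancy-check imputation described above might be sketched roughly as follows. This is not the Bureau's actual program: the function name, the rule for choosing a donor household, and the example listing are all assumptions made for illustration. A share of the units listed as vacant, equal to the survey conversion rate, is converted to occupied and given the household size of a neighboring occupied unit.

```python
import random

def impute_vacants(units, conversion_rate, rng):
    """Convert a share of vacant units to occupied, copying a neighbor's size.

    units: list of household sizes in listing order; 0 marks a unit listed vacant.
    """
    vacant = [i for i, size in enumerate(units) if size == 0]
    n_convert = round(conversion_rate * len(vacant))
    result = list(units)
    for i in rng.sample(vacant, n_convert):
        # Nearest occupied unit earlier in the listing, else the nearest later one.
        donors = [s for s in reversed(units[:i]) if s > 0] or [s for s in units[i:] if s > 0]
        result[i] = donors[0]
    return result

units = [3, 0, 2, 4, 0, 0, 1, 0, 5, 0]       # hypothetical enumeration district
imputed = impute_vacants(units, 0.4, random.Random(0))
```

With five vacant units and an assumed conversion rate of 0.4, two units are converted; the persons added exist only by inference, which is why no further information can be obtained about them.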

In  most  areas,  the  census  was  taken  by  mail  and  for  these 
areas  the  post  office  updated  the  mailing  list.  To  try  to  be 
equitable  for  nonmail  areas,  the  Census  Bureau  instituted 
this  post  office  check  in  the  South,  where  the  undercount 
traditionally  has  been  the  highest.  The  Bureau  gave  the  post 


office  all  of  the  addresses  it  knew  about  and  the  postal 
carriers  checked  these  addresses  and  identified  the  ones 
they  felt  had  been  left  out  of  the  census.  Original  plans  had 
called  for  processing  on  a  100-percent  basis,  but  due  to  the 
volume  and  lateness,  it  was  felt  that  this  kind  of  program 
could  not  be  implemented.  Therefore,  the  Bureau  visited  a 
sample  of  these  to  see  what  proportion  of  the  post  office 
reports  of  missing  addresses  were  left  out  of  the  census.  The 
rates  were  found  for  various  areas  in  the  South,  and  the  com- 
puter was  programmed  to  impute  this  proportion  of  missing 
addresses  in  the  South. 

A very large part of the 5 million imputations was referred
to as mechanical errors. In processing, often things will
happen to data, even though the questionnaires were fairly
well filled out by the enumerators. One of the techniques
used  to  handle  this  problem  was  to  go  back  to  the  address 
register,  where  the  number  of  persons  for  each  questionnaire 
was  recorded  and  to  impute  that  number  of  persons  into  the 
census  files.  Approximately  2.5  million  persons  were  im- 
puted because  of  mechanical  errors. 

Concern  was  expressed  that  the  design  of  the  PES  has  still 
not  been  determined,  and  the  timeliness  of  the  PES  was 
questioned  with  the  interview  being  conducted  long  after  the 
census.  Several  questions  were  raised  relative  to  the  PES  and 
three  suggestions  were  made:  (1)  The  PES  could  conceivably 
be  subcontracted  out  if  the  Bureau  could  deputize  an  out- 
side agency  to  handle  confidentiality.  Thus,  independently, 
the  PES  could  be  timed  to  coincide  with  the  census.  (2) 
Possibly  monetary  incentives  could  be  used  for  response  to 
the  PES  on  a  relatively  small  scale  at  intervals  throughout 
the  decade.  This  should  differ  from  the  CPS  in  that  it  would 
only  focus  on  hard-core  census  information,  which  would 
help  to  study  methods  for  undercount  improvement  and  get 
a  better  over-all  projection  of  current  population  estimates. 

The  Census  Bureau  indicated  that  the  PES  is  well  designed, 
at  least  in  terms  of  how  the  questionnaire  is  to  be  designed, 
how  to  collect  and  process  the  data,  a  timetable  for  process- 
ing, and  most  of  the  ideas  on  matching  and  estimation.  Over 
the  past  2  to  3  years,  the  Census  Bureau  has  been  conducting 
pretests  and  has  noted  that  PES  techniques  seem  not  to 
produce  adequate  data  to  estimate  the  undercount.  As  a 
result, many things have been tried to overcome what appears
to  be  some  of  the  deficiencies  in  PES  conclusions.  One  of 
these  is  that  there  is  a  correlation  bias  between  the  PES  and 
the  census;  in  other  words,  a  person  left  out  of  the  census 
also  tends  to  be  left  out  of  PES,  perhaps  for  some  of  the 
same  reasons,  such  as  deliberate  concealment. 
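
Correlation bias can be illustrated with a small dual-system calculation. The sketch assumes the usual dual-system (Petersen) estimator, census total times PES total divided by the matched count, and a hypothetical population made of two strata; within each stratum the census and the PES are independent, but the hard-to-count stratum is more likely to be missed by both. All numbers are invented for illustration.

```python
def dual_system_estimate(in_both, census_only, pes_only):
    """Petersen estimator: (census total * PES total) / matched."""
    census_total = in_both + census_only
    pes_total = in_both + pes_only
    return census_total * pes_total / in_both

N = 1_000_000                      # hypothetical true population
# Two hypothetical strata: (share of population, P(in census), P(in PES)).
# Independence holds within a stratum but not in the pooled population.
strata = [(0.9, 0.98, 0.95), (0.1, 0.80, 0.60)]

in_both = sum(N * s * pc * pp for s, pc, pp in strata)
census_only = sum(N * s * pc * (1 - pp) for s, pc, pp in strata)
pes_only = sum(N * s * (1 - pc) * pp for s, pc, pp in strata)

estimate = dual_system_estimate(in_both, census_only, pes_only)
```

Under these assumptions the pooled estimate is higher than the census count but still below the true population, so part of the undercount remains undetected.
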

The Census Bureau agreed that if the PES were conducted
on  the  same  day  as  the  census,  or  shortly  thereafter,  some  of 
the  problems  may  be  minimized.  Two  additional  problems 
may  be  expected,  however.  The  first  is  that  if  people  are 
enumerated  in  the  census  and  the  PES  too  closely  together, 
it  is  likely  to  bias  either  the  census  or  the  PES— likely  to  con- 
dition responses— and  thus  make  the  estimated  undercount 
even  more  biased.  The  second  is  that  the  census  is  more 
important than the PES; therefore, the PES should not be
allowed  to  affect  the  census  counts. 

With  respect  to  the  use  of  monetary  incentives,  this  ap- 
proach was  studied  as  part  of  the  1972  Consumer  Expendi- 
ture Survey,  and  the  results  indicated  that  the  money  is 
better  spent  on  enumerator  training. 

The  Census  Bureau  also  discussed  the  matching  for  the 
PES.  The  problems  of  using  administrative  records  are  well 
known;  in  doing  a  match  of  two  lists  using  name,  sex,  age, 
and  race,  both  matched  and  nonmatched  cases  may  be  in 
error,  and  this  problem  is  magnified  when  one  tries  to  match 
perfectly.  The  possibility  for  errors  resulting  from  coding 
the  sample  cases  also  was  mentioned.  The  Bureau  would 
like  to  guard  against  developing  undercount  figures  that  are 
really  describing  a  mismatch  rate. 

The  Census  Bureau  then  questioned  the  conference  par- 
ticipants as  to  their  views  on  administrative  record  match- 
ing. The  Bureau  envisions  two  solutions  if  one  were  not  sat- 
isfied with  dual-system  estimation:  One  is  to  undertake 
triple-system  instead  of  dual-system  estimation,  although 
it  was  not  felt  that  triple-system  would  yield  significantly 
more  information  than  dual-system  estimation.  The  second 
way  would  be  to  do  a  second  dual-system  estimation  using 
the  PES  and  administrative  records. 

It  was  indicated  by  some  that  two  dual  systems  would 
be  more  feasible  in  reducing  matching  problems,  but  that 
there  could  be  problems  with  that  approach.  The  main 
problem  is  bias.  The  correlation  bias  of  the  PES  is  well 
known,  but  not  as  much  is  known  about  the  bias  of  admini- 
strative records  themselves;  and  the  sum  of  the  two  biases 
remains  uninvestigated.  It  was  thought  that,  in  terms  of 
picking  a  sample  estimate,  it  is  an  issue  of  which  of  the 
two  biases  is  likely  to  be  less. 

The  point  was  emphasized  that  procedures  used  by  the 
Census  Bureau  in  imputation  are  based  on  the  probability 
of  categories  of  households  or  the  probability  of  people 
being  missed.  What  ought  to  be  done  with  dual-  or  triple- 
system  estimation  is  to  estimate  as  best  as  possible  what 
missed  rates  are  for  important  categories  of  people  and,  if 
that  can  be  done,  the  issue  of  the  two  Siegel  papers  could 
be resolved. That is, is it true that blacks and whites in different
age-sex groups are missed at nationally consistent rates,


or  is  it  a  fact  that  certain  places  where  blacks  happen  to 
live  are  hard  to  enumerate?  It  was  speculated  that  there  are 
particular  areas  that  are  hard  to  enumerate,  and  blacks  and 
Hispanics  live  there. 

The  Census  Bureau  indicated  that  in  triple-system  estima- 
tion the  information  obtained  from  a  three-way  match  yields 
information  for  seven  out  of  eight  cells  representing  the  pop- 
ulation—the eighth  cell  representing  those  persons  not  on  any 
of  the  three  lists.  An  estimate  of  the  eighth  cell  can  be  made 
either  by  taking  into  account  the  other  seven  cells,  or  by  col- 
lapsing the  information  into  three  two-way  tables  of  four 
cells  each— the  fourth  cell  of  each  being  empty— and  averag- 
ing the  four  estimates  of  the  vacant  cells.  The  difficulty  with 
triple-system  estimation  is  that  while  it  reduces  the  bias, 
there  is  also  an  increase  in  variance. 
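
One reading of the collapsing approach can be sketched as follows; it is an interpretation, not the Bureau's procedure. The seven observed cell counts are hypothetical, each pair of lists yields a dual-system estimate of the population with the third list ignored, and the pairwise estimates are averaged before the unobserved eighth cell is derived.

```python
from itertools import combinations

# Observed counts keyed by (on list A, on list B, on list C); the cell (0, 0, 0),
# persons on none of the lists, is unobserved. Counts are hypothetical.
cells = {
    (1, 1, 1): 504, (1, 1, 0): 216, (1, 0, 1): 126, (0, 1, 1): 56,
    (1, 0, 0): 54, (0, 1, 0): 24, (0, 0, 1): 14,
}
observed = sum(cells.values())

def pairwise_estimate(i, j):
    """Dual-system estimate from lists i and j, ignoring the third list."""
    n_i = sum(c for k, c in cells.items() if k[i])
    n_j = sum(c for k, c in cells.items() if k[j])
    both = sum(c for k, c in cells.items() if k[i] and k[j])
    return n_i * n_j / both

estimates = [pairwise_estimate(i, j) for i, j in combinations(range(3), 2)]
population = sum(estimates) / len(estimates)
missing_cell = population - observed
```

The example counts were built from three independent lists, so the three pairwise estimates agree. With real data they would differ, and averaging them trades some of the bias reduction against added variance.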

The  Bureau  also  argued  that  while  direct  estimates  can  be 
used  for  sampled  areas,  regression  or  synthetic  estimation 
must  be  used  for  the  thousands  of  areas  not  in  the  sample. 
The  problem  with  using  synthetic  methods  is  that  to  get 
a  good  fit,  so  many  degrees  of  freedom  may  be  used  that  the 
variance  may  rise  rapidly.  On  the  other  hand,  one  could  run 
a  curve  through  the  data,  which  is  not  a  problem  if  one  is 
thinking  of  a  linear  curve  with  eight  variables;  but  the  curves 
may  be  curvilinear,  involving  the  products  of  three  of  the 
variables.  Use  of  regression  to  smooth  out  difficulties  was 
considered,  but  a  suitable  form  for  the  regression  could  not 
be  suggested. 

Questions  then  arose  concerning  the  possible  scope  of  the 
adjustments.  It  was  noted  that  the  discussion  thus  far  had 
focused only on population adjustments. Other factors, such
as income, are also important for Federal fund distribution,
and it was asked what assumptions would be made
about other population characteristics if population is
adjusted. The Census Bureau responded that it has published
reports showing the effects on the revenue-sharing
formulas of adjusting population only, per capita income only,
and all components (population, per capita income, and
income). For local areas, the formula reduces virtually to an
adjustment  of  per  capita  income  squared  and,  except  for 
certain  areas  constrained  because  of  limits  placed  on  per 
capita  allocations,  the  adjustment  eliminates  any  effect  of 
population  at  the  local  level.  Thus,  in  some  sense,  for  most 
areas  the  adjustment  for  population  becomes  an  academic 
issue  when  one  is  talking  about  adjusting  the  data  in  the 
revenue-sharing  formula  for  local  areas. 
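
One way to see why population can drop out is the following sketch. It assumes, as an illustration rather than a statement of the statute, that a local allocation is proportional to population times tax effort times inverse relative income, and that aggregate income is computed as population times per capita income. The symbols are introduced here and do not appear in the discussion: $P_i$ is population, $T_i$ is taxes raised, $y_i$ is per capita income, and $\bar{y}$ is the per capita income of the surrounding area.

```latex
A_i \;\propto\; P_i \times \frac{T_i}{P_i \, y_i} \times \frac{\bar{y}}{y_i}
    \;=\; \frac{T_i \, \bar{y}}{y_i^{2}}
```

Under these assumptions $P_i$ cancels, and the allocation depends on per capita income only through $1/y_i^{2}$, which would be consistent with the remark that the adjustment reduces virtually to one of per capita income squared.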

While  it  seems  counterintuitive,  population  adjustments 
have  virtually  no  effect  on  the  distributions  in  the  general 
revenue  sharing  allocation  system.  The  income  adjustment 
"drowns  out"  the  population  adjustment.  In  addition,  most 
places  lose  money  when  adjustment  is  made  for  undercount. 

INTRODUCTION

Though  ours  is  a  conference  on  census  undercounts,  it 
cannot  stand  in  complete  isolation  from  other  kinds  of  miss- 
ing data.  With  regard  both  to  its  sources  and  to  its  remedies, 
it  can  and  should  be  linked  to  what  we  know  about  the 
diverse  types  of  missing  data,  and  also  to  what  more  we  need 
to  know  about  them. 

The  problems  of  undercounts  are  neither  entirely  similar 
to  nor  entirely  different  from  other  problems  of  missing 
data.  This  is  worth  stating  because  both  those  extremes  have 
partisans  who  hold  them  with  confidence,  also  with  some 
justification.  One  extreme  position  claims  that,  since  under- 
counts are  merely  one  type  of  missing  data,  the  justifications 
and  the  methods,  adjustments,  weightings,  and  imputations, 
which  are  used  for  other  types  of  missing  data,  should  also 
be  applicable  to  undercounts.  The  other  extreme  holds  that 
census  undercounts  must  be  seen  as  a  completely  different 
problem,  because  both  politically  and  basically  census  data 
are  and  must  remain  the  standards  of  comparison,  hence 
attempts  to  use  auxiliary  data  are  wrong  as  well  as  difficult 
or  futile. 

Both  extremes  have  been  ably  argued  in  this  conference. 
However,  most  of  us,  I  believe,  have  come  to  hold  a  position 
between  those  extremes,  for  reasons  noted  below.  And  such 
compromise  positions  suggest  that  the  Census  Bureau  should 
publish  (in  due  time)  population  estimates  adjusted  for 
estimates  of  noncoverage,  but  also  suggest  that  those  esti- 
mates be  clearly  distinguished  from  the  census  counts,  which 
should  continue  to  denote  persons  and  units  for  whom  direct 
evidence  was  perceived  in  the  census  enumeration. 

Some  other  related  beliefs  and  matters  may  be  treated 
merely  in  passing  here,  though  they  have  been  or  are  impor- 
tant in  other  times  or  contexts.  First,  it  is  not  common  any 
more  to  believe  that  census  counts  are  free  from  errors  and 
omissions,  or  can  be  so  made  either  by  definition  or  by  force 
of  legal  requirements.  The  courageous  disclosure  policy  of 
the Census Bureau helped to dispel that myth.

Second, we do not now believe that with reasonable
effort and expenditure the undercounts can be reduced
to the vanishing point, or even perhaps to any worthwhile
extent [1]. There is less agreement about the possibility
that some thorough, skillful, and expensive methods, applied
to samples, could perhaps measure and estimate the
undercounts reasonably well. Some sort of postenumeration
survey, together with some dual records systems, is usually
proposed here.


Third,  we  are  willing  to  consider  the  problems  of  popula- 
tion separately  from  other  errors  of  observations.  Never- 
theless, relations  to  other  forms  of  missing  data  must  be 
noted  below.  Furthermore,  relationships  to  errors  of  observa- 
tions have  been  noted  in  references  to  "faulty  and  missing 
data"  in  the  remarks  of  Director  Barabba  here  and  in  the  title 
of  an  American  Statistical  Association  session  in  1978. 
Also,  powerful  arguments  have  been  advanced  here  about  the 
link  of  misstatements  about  income  to  population  under- 
counts for  equity  formulas.  We  must  remain  aware  that  the 
separation  of  population  undercounts  from  other  errors 
is  an  artifact  for  the  sake  of  simplicity.  Corrections  of  various 
kinds— editing,  imputing,  weighting— are  commonly  accepted 
both  in  censuses  and  samples  for  diverse  types  of  nonre- 
sponse  and  for  errors  of  response  and  observation  discussed 
below. 

Fourth,  the  effects  due  to  the  obsolescence  of  censuses 
should  be  considered  separately,  though  they  seldom  are,  and 
also  were  largely  neglected  in  this  conference.  They  will 
be  mentioned  briefly  later  in  connection  with  censal  and 
postcensal  estimates. 

Fifth,  we  do  not  expect  good  administrative  registers  and 
records  to  replace  the  need  for  censuses  in  the  United  States 
in  the  near  future.  However,  the  statistical  profession  remains 
neutral concerning their desirability. It is naive to believe
that good population registers belong only to authoritarian
regimes. Actually, Scandinavia has the best registers, and in
1980, Denmark will no longer take a census, Norway will take
its last one, and Sweden is debating the need for one.

It  is  in  this  rich  context  of  errors  of  diverse  types  that  we 
should  look  at  the  problems  of  undercounts  in  connection 
with  other  sources  of  missing  data.  My  experience  in  reading, 
listening,  and  discussions  has  been  that  a  great  deal  of  con- 
fusion and  some  controversies  arise  simply  for  lack  of  clear 
terminology  among  the  diverse  sources  of  missing  data.  We 
lack  agreement  on  the  terminology  for  missing  data  (and  for 
many  other  problem  areas),  but  I  hope  the  following  will 
cover  most  common  usage. 

TYPES  OF  MISSING  DATA,  NONOBSERVATION 

We begin even here with problems of terminology. Terms
other than the two above have been used for the entire class,
sometimes "nonresponse," but we prefer to reserve this term
for one of the categories below in this general class. Note
that as we proceed from 1 to 4, we descend from high levels
of knowledge about the missing units to low levels, as our
ignorance increases.

1. Item nonresponse, not ascertained (NA), or not stated
refer  to  specific  variables:  answers  missing  for  an  ele- 
ment (individual)  for  whom  other  observations  have 
been  accepted.  Reasons  for  NA  may  be  numerous: 
Refusal  or  incapacity  on  part  of  the  respondent, 
omissions  on  part  of  the  enumerator,  or  unusable  or 
invalid  answers  that  are  voided  in  the  editing  process. 
This  last  item  links  the  problems  of  missing  units  to 
errors  in  observations. 

2.  Total  nonresponse,  or  simply  nonresponse,  may  be  due 
to  refusals;  to  not-at-homes  (NAH)  after  appropriate 
attempts;  inabilities  of  various  kinds;  not  located, 
which  depends  on  specific  procedures;  or  lost  or 
destroyed  schedules.  (We  heard  here  about  the  many 
schedules  "chewed  up"  by  the  machines.)  The  total 
nonresponse  may  refer  to  the  individual  element 
(person)  or  to  its  group  (household). 

3.  Cluster  nonresponse  refers  to  observations  missing  for 
larger  units,  such  as  entire  areas  or  establishments, 
due  to  refusals,  inaccessibility,  absence  of  respondent, 
etc.  Its  sources,  occurrences,  effects,  and  treatment 
differ  from  the  others. 

4. Noncoverage or incomplete frame denotes failure to
include some units of the defined population(s) in the
actual  operational  frame.  The  failures  come  from 
faulty  preparation  of  the  frame,  from  faulty  execution 
of  enumeration  procedures,  and  from  faulty  responses. 
The  net  noncoverage  refers  to  the  excess  of  non- 
coverage  over  overcoverage,  and  the  gross  coverage 
error  can  refer  to  the  sum  of  the  absolute  values  of  the 
two  coverage  errors.  The  census  undercount  is  a  special 
case  of  net  undercoverage. 

5.  Deliberate  and  explicit  exclusions  of  sections  of  the 
population  differ  from  noncoverage,  also  from  non- 
response.  Their  sources,  effects,  and  treatments  are 
all  different  from  those  of  the  other  four  forms. 
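
The distinction in item 4 between net and gross coverage error can be written out in a few lines. This is a minimal sketch; the function name and the figures are hypothetical.

```python
def coverage_errors(omissions, erroneous_inclusions):
    """Return (net noncoverage, gross coverage error).

    Net noncoverage is the excess of noncoverage over overcoverage;
    gross coverage error is the sum of the absolute values of the two.
    """
    net = omissions - erroneous_inclusions
    gross = abs(omissions) + abs(erroneous_inclusions)
    return net, gross


# Hypothetical area: 500 persons omitted, 120 counted in error.
net, gross = coverage_errors(500, 120)
print(net, gross)
```

Two areas with the same net figure can have very different gross errors, which is why the two measures are kept apart.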

For  item  nonresponse,  evidence  from  the  field  indicates 
existence  of  individuals,  plus  some  (usually  most)  of 
their  characteristics.  From  relationships  with  these  vari- 
ables, the  missing  item  is  imputed  with  some  (though  not 
total)  confidence  and  objectivity.  For  total  nonresponse, 
only  the  existence  of  the  unit  is  ascertained,  not  its  charac- 
teristics. When  the  nonresponding  unit  is  the  household, 
census  enumeration  of  numbers  of  persons  cannot  be  done 
directly,  and  those  very  numbers,  as  well  as  their  character- 
istics, are  imputed  from  "similar"  households.  Even  this 
imputation  is  based  on  field  ascertainment  of  the  existence 
of  the  household.  Hence,  all  of  these  adjustments  and  impu- 
tations have  been  properly  considered  as  belonging  to  census 
counts. 

However,  noncoverage  differs  from  the  three  classes  of 
nonresponse  above.  It  results  in  more  difficult  problems 
because  of  ignorance  of  the  very  existence  of  the  missing 
units. The problems are intractable within the framework of
the collection procedures of the census enumeration. For
evidence  about  them,  methods  must  go  beyond  those  collec- 
tion procedures  to  other  expensive  methods,  to  models 
(demographic  and  other),  and  to  subjective  bases.  Hence, 
adjustments  for  noncoverage  should  be  considered  as 
resulting  in  estimates  rather  than  in  census  counts. 

It  seems  that  the  diverse  types  of  missing  units  also  tend 
to  have  different  sources  in  the  population.  Item  nonresponse 
tends  to  be  higher  in  populations  which  may  be  termed  "less 
developed";  in  less  developed  countries,  among  rural  and  less 
educated  groups,  and  among  nonmembers  of  the  labor  force. 
In  contrast,  total  nonresponse,  refusals,  and  NAH's  tend  to 
be  higher  among  the  cosmopolitan,  urbanized,  educated,  and 
labor-force  members.  Noncoverage,  however,  is  most  com- 
mon among  the  most  mobile,  least  settled  portions,  espe- 
cially among  young  males  (particularly  poor  young  males), 
among  migrants,  and  in  mobile  occupations.  In  the  United 
States,  it  is  especially  high  in  centers  of  large  cities  (though 
more  rural  in  many  countries),  among  blacks,  Chicanos,  and 
some  other  ethnic  groups.  The  quality  of  the  enumeration 
is  important;  also,  procedure,  training,  and  budgets.  These 
must  be  applied  separately  for  low  rates  of  noncoverage, 
nonresponse,  item  nonresponse,  callbacks,  repeat  enumera- 
tions, checks,  editing,  and  imputing. 

DIVERSE  EFFECTS  ON  DIFFERENT 
STATISTICS 

It  is  common  to  concentrate  technical  discussions  on  one 
kind of statistics and to neglect others. It is also common to
focus  on  only  one  type  of  missing  data  and  neglect  other 
types,  even  though  the  different  types  have  diverse  effects 
on  the  different  kinds  of  statistics.  These  very  differences 
should  warn  us  to  consider  briefly  the  major  kinds  of 
statistics. 

1.  Simple  totals  denote  counts  of  persons,  households, 
etc.;  subclasses  by  age,  sex,  ethnic  category,  etc.;  also, 
sums  of  variables  beyond  simple  counts,  such  as 
acreages  and  incomes.  We  should  also  distinguish 
effects  not  only  on  national  totals  and  on  major 
domains,  but  also  in  small  local  areas  and  other  small 
domains.  The  relative  effects  of  different  kinds  of 
missing  units  and  of  undercounts  can  be  very  different 
for  different  sizes  of  domains. 

Noncoverage has the effect of imputing y_i = 0 instead
of  the  actual,  individual  values,  and  nonresponse  has  a 
similar  effect,  if  uncompensated.  Overall  adjustments 
from  demographic  and  similar  models  can  be  made 
sometimes  reasonably  well  for  the  national  totals  but 
not  for  local  areas.  These  latter  suffer  both  from  small 
sizes,  and  also  from  lack  of  enclosed  populations; 
hence,  from  transmigrations  across  boundaries. 

2. Ratio means and averages (means and medians), including
means for subclasses, are less affected than simple
totals by noncoverage and nonresponse to the degree
that these resemble the response portions. But they are
affected to the extent that the noncoverage and
nonresponse cases tend to differ from the responses;
they suffer from differential nonresponses if these are
not adjusted or compensated. This advantage of ratio
means can be transferred to simple totals by ratio
estimation, but only to the extent that the size of the
noncovered population is known or can be estimated.
And this is especially difficult to do for small areas.

3.  Multivariate  relations  tend  to  be  affected  differently 
than  ratio  means  and  simple  totals.  Here  we  include 
common  comparisons  between  subclass  means,  as  well 
as  more  complex  analytical  statistics,  such  as  regres- 
sions. Like  ratio  means,  relations  tend  not  to  be 
affected  by  "average"  kinds  of  nonresponse  or  non- 
coverage.  And  they  tend  to  be  even  less  affected  than 
ratio  means  to  the  extent  that  differential  effects,  if 
they  exist,  tend  to  cancel  (because  effects  are  of  the 
"additive"  kind).  These  are  useful  overgeneralizations 
to  which  exceptions  can  be  found. 

However,  multivariate  relations  can  suffer  greatly 
from  relatively  small  portions  of  item  nonresponses 
on  each  of  the  several  variables  involved  in  the 
multivariate  relations;  those  small  portions  tend  to 
cumulate  toward  their  sum.  Thus,  omitting  even  small 
portions  of  NA's  can  become  damaging  to  multivariate 
statistics;  hence,  imputation  may  be  preferred. 
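
The contrasts drawn in the three items above can be illustrated with a small sketch. The numbers are hypothetical, and the last function assumes, for simplicity, that item nonresponse on different variables occurs independently; that assumption is ours, not the paper's.

```python
def total_and_mean(values):
    """Simple total and mean of the observed values."""
    return sum(values), sum(values) / len(values)


def share_lost_to_item_nonresponse(item_rates):
    """Share of cases with at least one missing item.

    Assumes item nonresponse is independent across variables; the
    result approaches the sum of the rates when the rates are small.
    """
    complete = 1.0
    for rate in item_rates:
        complete *= 1.0 - rate
    return 1.0 - complete


# Hypothetical population of 100 persons, each with y = 10; ten are not
# covered. Dropping them acts like setting their y to 0 in the total,
# while the mean is untouched because the missed resemble the covered.
full_total, full_mean = total_and_mean([10] * 100)
covered_total, covered_mean = total_and_mean([10] * 90)
print(full_total, covered_total, full_mean, covered_mean)

# If the missed persons differ (here y = 4), the mean is biased too.
true_mean = total_and_mean([10] * 90 + [4] * 10)[1]
print(true_mean, covered_mean)

# Five variables, each with 3 percent item nonresponse.
print(share_lost_to_item_nonresponse([0.03] * 5))
```

The last figure shows why dropping incomplete cases can hurt a multivariate analysis even when each variable alone looks nearly complete.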

METHODS  OF  ADJUSTMENT  FOR  THE 
CENSUS  UNDERCOUNTS 

These  raise  special  difficulties  that  are  different  from 
those  of  the  other  types  of  missing  units  noted  in  section  2. 
Item  nonresponse  occurs  usually  in  situations  where  a  great 
deal  of  information  is  available  for  the  individual.  Procedures 
can  be  designed,  similar  to  those  for  editing  faulty  data, 
which  permit  imputing  more  or  less  reasonable  values  for  the 
missing  data.  For  total  nonresponse,  only  the  existence  of  the 
unit,  either  the  person  or  the  household,  is  ascertained.  The 
values  for  the  unit  are  then  imputed  with  more  or  less 
success.  For  household  nonresponse,  even  the  number  of 
persons  is  imputed. 

For noncoverage, the problems are more difficult because
the numbers of units, persons, and households are themselves
unknown; thus, the difficulty is greater than for nonresponse.
Even this adjustment can be done reasonably well if good
auxiliary variables are available, as with poststratification for
the CPS and other samples. Those estimates rely on vital
registry data based on the last decennial census.
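
Poststratification of a sample to outside controls can be sketched as follows. This is a generic illustration, not the CPS procedure; the function name, the strata, and the totals are hypothetical.

```python
def poststratification_factors(weighted_sample_totals, control_totals):
    """Factor by which to scale sample weights in each poststratum.

    `weighted_sample_totals[s]` is the sample's weighted count in
    stratum s; `control_totals[s]` is the independent population
    figure for that stratum (for example, from census and vital data).
    """
    return {
        stratum: control_totals[stratum] / sample_total
        for stratum, sample_total in weighted_sample_totals.items()
    }


# Hypothetical sample that underrepresents young males.
sample_totals = {"young males": 800.0, "all others": 9000.0}
controls = {"young males": 1000.0, "all others": 9000.0}
print(poststratification_factors(sample_totals, controls))
```

The factor above 1 for the first stratum compensates for noncoverage in the sample, but only because the control total is supplied from outside, which is exactly what the census cannot do for itself.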

But undercounts in the census itself raise special problems
because their adjustment needs the very outside support that
the decennial census must provide both for itself and for other
samples. Yet from reasonable models, which must depend on
stability in transmigrations and in life processes, reasonable
adjustments are feasible for national totals and for larger
domains. However, for small domains and for local areas, the
problems become much more difficult. Nevertheless, methods
for small-domain estimation do exist, have been improved
recently, and can be applied here.

Adjustments  of  the  census  data  require  two  phases.  First, 
information  must  be  obtained  from  sources  outside  the 
census  itself.  Clearly,  this  must  be  difficult  and  perhaps 
expensive;  otherwise,  the  data  would  be  collected  from  the 
census  itself.  There  are  three  major  methods  which  may  be 
alternatives  or  used  in  combination  for  more  reliability. 

1.  Demographic  analysis  depends  mainly  on  data  from 
past  censuses  and  from  vital  registers;  also  on  models 
of  stability  in  vital  rates,  and  of  large  changes  in 
missing  rates  as  persons  age. 

2.  Postenumeration  surveys  attempt  to  find  larger 
and/or  different  portions  of  the  population— an 
assumption  that  cannot  be  verified  entirely. 

3.  Checks  of  diverse  administrative  records  and  registers 
can  be  made  alone,  or  as  part  of  a  dual  (or  triple) 
records system [4].

Second,  since  the  information  obtained  in  the  first  phase 
would  pertain  only  to  major  classes  (demographic,  ethnic, 
sex,  etc.),  it  needs  to  be  distributed  to  local  areas  and  to  other 
small  domains,  where  it  is  most  needed.  This  can  be  done 
with  several  methods  which  are  generally  known  as  "small 
domain  estimation"  or  "postcensal  estimates."  They  are  of 
several types [6, 7], and some of these may also be useful
for  distributing  to  local  areas  information  obtained  on  a 
national  or  large  scale.  (1)  Symptomatic  accounting  techni- 
ques may  be  used  that  utilize  current  data  from  administra- 
tive registers  in  combination  with  their  statistical  relation- 
ships based  on  earlier  census  data.  Techniques  using  diverse 
registers  have  been  developed  by  the  Census  Bureau.  (2) 
Synthetic  estimation  (presented  by  Hill  in  this  publication) 
refers  to  ratio  estimates  combining  data  from  recent  samples 
with  census  data  for  small  areas.  (3)  Regression  methods 
(given  by  Ericksen,  also  in  this  publication)  use  multiple 
regressions  of  census  counts  on  data  from  registers  with  post- 
censal data  from  registers  and  from  samples.  (4)  Bayesian 
methods  can  have  great  variety  and  flexibility  that  depend 
on  subjective  choices  of  models  and  parameters.  (5)  Iterative 
proportional  fitting  refers  to  flexible  methods  of  categorical 
data analysis utilizing the strength of modern computers. An
association structure establishes relationships between census
data and associated variables for cells of small domains.
Then  an  allocation  structure  of  the  associated  variables  for 
various  marginal  summations  is  used  to  readjust  the  data  in 
the  small  cells.  It  allows  for  much  more  flexibility  than  the 
assumptions  of  linear  relations  in  either  synthetic  or  regres- 
sion methods. 
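
Iterative proportional fitting, the fifth method above, can be sketched for a two-way table. This is a minimal illustration with hypothetical numbers: the starting table stands for the census-based cell structure, and the row and column targets stand for current or adjusted marginal totals.

```python
def iterative_proportional_fit(table, row_targets, col_targets, iterations=50):
    """Rescale a two-way table until its margins match the targets.

    Rows and columns are scaled in turn. The scaling keeps the
    cross-product ratios of the starting table, so the cell-level
    relationships are preserved while the margins are updated.
    """
    cells = [row[:] for row in table]
    for _ in range(iterations):
        for i, target in enumerate(row_targets):
            row_sum = sum(cells[i])
            if row_sum > 0:
                cells[i] = [value * target / row_sum for value in cells[i]]
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in cells)
            if col_sum > 0:
                for row in cells:
                    row[j] *= target / col_sum
    return cells


# Hypothetical table: rows are two local areas, columns two groups.
census_cells = [[40.0, 10.0], [20.0, 30.0]]
fitted = iterative_proportional_fit(census_cells, [60.0, 60.0], [70.0, 50.0])
for row in fitted:
    print([round(value, 2) for value in row])
```

The row and column targets must share the same grand total for the fit to settle; here both sum to 120.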

We must note that both phases (first, getting data and
models for types, classes, and categories of missing persons,
and then distributing them to small domains) involve models
and subjective judgments. The extent and kind of subjectivism
involved can vary, and they differ somewhat
from those involved in imputing and editing for other kinds
of missing and faulty data. The differences are not absolute,
but they are not trivial either. (Perhaps these differences
resemble those between "direct" and "circumstantial"
evidence.)

POLICY  DECISION  ABOUT  ADJUSTMENTS 
FOR  CENSUS  UNDERCOUNTS 

Policy  decisions  concerning  adjustments  involve  questions 
both  of  statistical  principles  and  public  policy.  Policy  ques- 
tions arise  from  the  subjective  nature  of  adjustments.  Be- 
cause they  are  subjective,  they  could  be  matters  for  litigation 
and  even  for  abuse,  which  could  destroy  the  integrity  or  at 
least  the  credibility  of  an  independent,  objective,  national 
statistical  service. 

Clearly,  there  are  judgments  involved  in  decennial  cen- 
suses. The  choice  of  10  years  for  census  periods,  instead  of 
either  1  or  100,  for  example,  depends  on  unstated  models 
about  the  stability  of  census  variables.  So  does  the  census 
date of April 1, which has effects on distributions of
populations, both de facto and de jure. These fluctuations
should be explored elsewhere in terms of models
[2, 3].

Three  decisions  concerning  adjustments  have  large  policy 
components. 

1.  Should  there  be  an  official  adjusted  figure?  Of  course, 
there  are  unofficial  adjustments;  likewise,  continued 
use  of  an  unadjusted  decennial  census  for  up  to  14 
years  later  is  also  a  decision. 

2.  Where  should  the  office  for  adjustments  be  located? 
In  the  Census  Bureau  or  elsewhere? 

3.  How  far  should  adjustments  go?  There  seem  to  be  sev- 
eral criteria  (as  usual)  contending  for  primary  attention. 
There  are  simplicity,  understandability,  and  objec- 
tivity. Then  there  are  "equity,"  in  some  sense,  and 
"accuracy,"  which  may  be  put  as  some  average  of  some 
error  function,  perhaps  a  mean  squared  error.  The 
former  set  may  be  in  conflict  with  the  latter;  that  is, 
objective,  simple  adjustments  may  be  small  and  not 
bring  as  much  equity  and  accuracy  as  subjective,  more 
complex  adjustments. 
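
The "accuracy" criterion in item 3 can be made concrete with a mean squared error over areas. This is a minimal sketch; the true values would be unknown in practice, and all the numbers below are hypothetical.

```python
def mean_squared_error(estimates, truths):
    """Average squared difference between estimates and true values."""
    return sum((e - t) ** 2 for e, t in zip(estimates, truths)) / len(truths)


# Hypothetical true populations (thousands) for three areas, with
# unadjusted counts that run low and adjusted figures that come closer
# but are not exact.
truths = [100.0, 200.0, 300.0]
unadjusted = [95.0, 190.0, 285.0]
adjusted = [101.0, 198.0, 303.0]
print(mean_squared_error(unadjusted, truths))
print(mean_squared_error(adjusted, truths))
```

A more complex adjustment lowers this measure only if the bias it removes outweighs the variance it adds, which is the conflict between the two sets of criteria noted above.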

It  seems  that  possible  decisions  concerning  the  census 
undercount  may  be  placed  roughly  into  four  levels  of  action. 

1. Accept the data essentially as obtained in the field by
the  enumerators  after  reasonable  supervision  and 
editing  in  the  field. 

2.  Add  the  Census  Bureau's  present  methods  for  editing, 
processing,  checking,  and  imputing. 


3.  Adjust  according  to  an  agreed  "convention"  (as 
Keyfitz  puts  it  well)  to  bring  national  and  domain 
estimates  up  to  the  best  available  model  for  the 
undercount. 

4.  Further  investigate  and  design  methods  of  imputation 
for  improved  adjustments  for  multivariate  as  well  as 
univariate  relations. 

Four  levels  may  be  too  many.  Probably  we  need  not  go 
back  to  the  purity  of  level  1  and  we  may  temporarily  con- 
sider 4  only  for  methodological  investigations.  However, 
both  levels  2  and  3  can  be  made  available  for  diverse 
purposes. 

I  hope  the  following  proposals  have  wide  acceptance  at 
the  conference  and  outside  as  well.  They  also  accord  well 
with  decisions  in  Australia. 

1.  The  methods  currently  used  and  planned  (level  2) 
will  be  designated  as  the  decennial  census  counts. 

2.  Censal  estimates  will  also  be  computed  in  accord  with 
estimates  of  the  undercoverage  and  based  on  a 
convention  [1]  agreed  to  independently  of  expected 
results. 

These  estimates  should  be  kept  distinct  in  name  and 
concept  from  census  counts  as  noted  above.  One 
proposal  (by  Roberts  in  this  publication)  is  that  esti- 
mates be  released  for  the  undercounts  only,  not  for  the 
adjusted  populations  [1].  That  interesting  proposal, 
however,  would  meet  practical  obstacles,  especially 
in  computing  multivariate  adjustments. 

The  estimates  would  need  to  bring  some  compro- 
mise between  adjusting  for  a  good  part  of  the  under- 
count and  simple  acceptability. 

3. The locus and responsibility for the estimates should
be assigned to an estimating office that would maintain
statistical,  scientific,  and  ethical  independence  from 
both  the  users  and  the  producers  of  statistics.  Probably 
a  separate  office  in  the  Census  Bureau  would  be  best, 
perhaps  with  an  impartial,  prestigious  advising  and 
overseeing  board. 

The same office could also be responsible for postcensal
estimates and even for projections for populations.
It may be wise to link the censal estimates to
postcensal estimates in methods and in conception,
for continuity.

4.  Decisions  to  use  either  the  census  counts  or  the  censal 
estimates  would  rest  with  public  decision  bodies.  It  is 
possible  that  some,  like  apportionment  bodies,  would 
choose  the  census  counts.  But  for  other  purposes 
(probably  for  revenue  sharing)  the  censal  estimates 
would  be  chosen.  These  offices  may  also  prefer  post- 
censal estimates  to  decennial  censal  estimates.  Other 
offices  (health,  energy,  etc.)  would  make  separate 
choices. 

Furthermore,  some  public  and  private  bodies  would 
prefer  some  other  estimates,  perhaps  some  that  are  less 
"safe," or less objective but more likely to come closer
to  the  actual,  current  population.  For  these,  as  well  as 
for  the  sake  of  future  improvements,  we  hope  that 
investigations  will  continue  to  be  pursued  vigorously. 

It  is  important  to  emphasize  again,  as  some  speakers  have 
done,  that  there  is  no  single,  stable  aim  with  specified  toler- 
ance limit  for  population  estimates.  The  aims  are  multi- 
purpose and  multisubject,  flexible  and  changing,  and  desired 
precisions  are  only  vaguely  felt  and  subject  to  compromise— as 
is  usual  for  actual  statistical  and  sampling  designs. 

SUMMARY

In  this  paper  we  consider  methods  for  the  estimation  of 
census  undercount  for  subgroups  of  the  population,  with 
particular  reference  to  small  geographic  areas.  More  specifi- 
cally, an  intensive  analysis  of  a  postenumeration  survey 
(PES)  is  seen  as  potentially  very  informative.  Empirical  Bayes 
analysis  of  logistic  models  with  random  effects  opens  up  a 
wide  range  of  models  which  a  priori  seem  to  reflect  the 
inherent  structure  in  a  complex  PES  and,  in  addition,  could 
lead  to  improved  estimates  of  census  undercount  for  small 
subgroups.  A  Bayesian  analogue  to  the  simple  ratio- 
expansion  technique  for  extrapolating  from  the  PES 
estimates  to  the  population  using  census  data  is  presented, 
and  the  extent  of  uncertainty  in  the  estimates  obtained  is 
seen  as  being  available  through  their  approximate  posterior 
variances.  Finally,  some  comments  are  made  with  regard  to 
the  implications  of  these  proposals  on  the  design  of  a  PES. 

INTRODUCTION 

The  national  census  of  population  is  a  basic  data  source 
for  allocating  legislative  representation,  for  redistributing  tax 
revenues,  for  economic  and  social  planning  by  businesses  and 
government  agencies,  and  for  research  studies  of  many  kinds. 
Since  a  census  is  by  definition  a  complete  enumeration,  it  is  a 
paradoxical  fact  that  one  of  the  most  challenging  aspects  of 
census  evaluation  is  to  assess  the  extent  and  distribution  of 
the  inevitable  incompleteness  of  the  actual  enumeration. 
Accuracy  for  local  areas  and  special  subgroups  of  the 
population  is  becoming  increasingly  important,  and  this  in 
turn  creates  pressure  to  improve  tools  and  methodology  for 
estimating  the  census  undercount  of  such  areas  and  sub- 
groups. In  this  paper  we  concentrate  on  potential  improve- 
ments which  may  be  achieved  through  detailed  analysis  and 
modeling  of  a  PES. 

Undercount  may  be  assessed  either  by  demographic 
analysis  or  by  matching  studies,  the  latter  category  including 
PES  methodology.  Demographic  analysis  [23]  employs 
aggregate  data  from  sources  external  to  the  census  as  input 
to  models  which  predict  the  population  on  census  day.  The 
external  sources  include  birth  and  death  registrations,  immi- 
gration and  emigration  statistics,  previous  censuses,  and 
administrative  records  such  as  Social  Security  and  Medicare 
files.  While  effective  at  a  national  level,  demographic  analysis 
for subnational areas must rely on internal migration
information drawn from the census itself. As noted by Siegel et al.
[23], the necessary migration data for subdivisions of a State
are lacking.

Matching  studies  require  external  data  collected  close  in 
time  to  the  census  date.  Individuals  picked  up  in  the  external 
data  are  compared  against  appropriate  census  categories,  and 
those  missing  from  the  census  are  identified  as  contributing 
to  the  undercount.  The  proportion  missed  in  the  matched 
sample  is  then  extrapolated  to  the  census  to  provide 
estimates  of  census  undercount.  Matching  can  be  based  on 
observational  sources  such  as  administrative  records,  but  a 
well  executed  PES  based  on  a  probability  sample  can  provide 
important  scientific  advantages  in  terms  of  data  quality, 
control  of  coverage,  and  use  of  the  randomization  hypothesis 
for  statistical  inference.  The  U.S.  Bureau  of  the  Census  [22] 
has  conducted  matching  studies  after  each  census  since 
1950,  including  sample  surveys  as  a  major  component 
of  the  matching  studies  in  1950  and  1970.  Undercount 
assessment  for  the  1976  Canadian  Census  of  Population  and 
Housing  emphasized  PES  methodology,  as  described  by 
Theroux and Gosselin [20, 21].
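
The extrapolation step in a matching study can be sketched as follows. This is a simplified illustration that assumes the external sample is itself complete and that the census contains no erroneous enumerations; the function name and the figures are hypothetical.

```python
def undercount_from_matching(census_count, sample_size, matched):
    """Estimate the number of persons the census missed.

    The share of sampled persons found in the census estimates the
    census coverage rate; dividing the census count by that rate
    gives an estimated population, and the difference is the
    estimated undercount.
    """
    coverage_rate = matched / sample_size
    estimated_population = census_count / coverage_rate
    return estimated_population - census_count


# Hypothetical category: 9,500 counted; 950 of 1,000 sampled persons
# were matched to a census record.
print(undercount_from_matching(9500, 1000, 950))
```

With a probability sample, the sampling variability of the matched proportion carries through to the estimated undercount, which is one of the advantages of a PES noted above.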

To be worthwhile, a PES must avoid the undercount problems
of the census itself, both overall and over important
subgroups. The PES therefore requires intensive and expensive
search procedures, which are affordable in principle because the
PES is a tiny fraction of the census. We are not concerned in this
paper with techniques for assuring an accurate PES. Our
analysis is concerned solely with what can be learned from
an accurate PES. In practice, it would be necessary to
assess the effects of undercount in the PES on the resultant
assessment of census undercount, but we do not present here
formal tools for such second-order undercount assessment.

We  primarily  address  the  problem  of  inferring  from  a 
thinly  spread  PES  to  local  areas  which  need  not  be  included 
in  the  survey.  A  typical  PES  will  be  a  multistage  survey  with 
a nested structure of primary sampling units (PSU's),
secondary  sampling  units  (SSU's)  within  PSU's,  tertiary 
sampling  units  (TSU's)  within  SSU's,  and,  finally,  households 
within  TSU's.  For  example,  we  might  need  to  make  a 
statement  about  the  census  undercount  of  teenage  black 
males  in  a  local  area  SSU  within  a  county  PSU  not  included 
in  the  postenumeration  survey.  Since  such  assessments  can 
only  be  made  with  uncertainty,  we  suggest  that  approximate 
Bayesian posterior distributions are appropriate reporting
mechanisms. Our posteriors are roughly described by
posterior  means  and  posterior  variances. 

To  carry  out  Bayesian  analysis  requires  detailed  models. 
To  find  appropriate  models  requires  extensive  data  analysis 
of  the  results  of  the  PES.  The  principles  required  for  such 
modeling  are  all  well  known,  and  the  required  technology  is 
feasible  but  by  no  means  completely  developed  at  present. 
Part  of  our  goal  is  to  outline  the  required  development  of 
these  statistical  techniques. 

In  section  2,  we  introduce  the  basic  foundation  of  logistic 
models  which  permit  representation  of  the  probability  that 
an  individual  will  be  captured  in  a  census  as  a  function  of  his 
or  her  characteristics  and  place  of  residence.  We  also 
introduce  the  basic  ratio-expansion  device  whereby  a  fitted 
probability  for  each  individual  observed  in  the  census  can  be 
used  to  create  an  estimate  of  the  number  of  missing 
individuals  in  specific  categories.  These  simple  modeling  and 
estimation  methods  are  not  adequate  to  produce  posterior 
distributions from a complex PES, so we proceed to specify
random  effects  in  the  logistic  models  which  allow  more 
realistic  assessments  of  posterior  variability  in  fitted  prob- 
abilities. Also,  we  show  how  to  extend  the  ratio-expansion 
estimate  of  undercount  to  an  approximate  posterior  dis- 
tribution of  undercount.  Finally,  we  comment  on  some  of 
the  implications  of  our  modeling  framework  on  the  design 
and  analysis  of  a  PES. 

BASIC  TECHNIQUES 

The  analysis  of  PES  data  is  directed  towards  estimating 
both  an  overall  undercount  rate  and  the  dependence  of 
undercount  rate  on  factors  associated  with  individuals, 
households,  and  geographic  locations.  Techniques  for 
assessing  variation  in  undercount  rate  involve  models,  either 
explicitly  or  implicitly.  We  make  our  models  explicit  by 
postulating  an  idealized  probability  of  being  missed  for  each 
individual  in  the  population  and  then  using  formal  mathe- 
matical models  which  represent  this  probability  as  a  function 
of  factors  associated  with  each  individual.  Logistic  multiple 
regression  [4]  provides  a  large  and  flexible  class  of 
models  for  the  representation  of  individual  probabilities. 
Here,  we  limit  discussion  to  standard  logistic  models  with 
fixed  effects  only,  while  later  we  make  the  necessary 
extension  to  random  effects,  which  permits  representation  of 
intraclass  correlation  within  households  or  larger  units. 
Another  potentially  important  extension,  to  include 
modeling  the  probability  of  missing  a  whole  household  unit, 
is  mentioned  later. 

Currently  available  estimation  techniques  for  small 
domains  have  been  reviewed  recently  by  Purcell  and  Kish 
[17].  Although  often  applied  to  estimating  counts  such  as 
unemployment  or  mortality  statistics,  most  of  the  available 
techniques  were  designed  primarily  for  continuous  variables. 
Examples include the regression models of Ericksen [8],
the synthetic estimation techniques of Gonzalez and
Hoza [10], and the prediction models of Holt, Smith, and
Tomberlin  [12]  and  Laake  [13].  Purcell  and  Kish  note 
that  "a  categorical  data  analysis  framework  appears  as  a 
logical  approach,  but  it  has  received  little  attention  to  date." 
Contingency  table  methods  have  been  used  mainly  for 
rescaling  survey  data  to  match  margins  taken  from  an 
external  data  source,  such  as  approximate  population 
marginal  totals.  The  basic  technique  is  the  iterative  propor- 
tional fitting  method  of  Deming  and  Stephan  [5] .  Purcell 
[16]  defines  a  general  approach  including  the  raking  ratio 
methods  of  Brackstone  and  Rao  [1]  and  the  work  of 
Chambers  and  Feeney  [2]  ,  who  adjust  aggregate  survey 
data  to  match  externally  obtained  small  area  characteristics. 
Available  small  area  techniques  do  not  apply  to  the  census 
undercount problem, where the essence is the incompleteness
of the external data source, i.e., the census.

Here,  we  rely  completely  on  techniques  designed  for 
counted  data.  The  first  part  of  the  discussion  describes 
logistic  models  which  lead  to  estimates  of  undercount  rates 
for  subgroups  of  the  population.  We  then  proceed  to  define 
straightforward  ratio-expansion  techniques  which  adjust 
census  counts  upwards  to  reflect  undercount  rates  estimated 
from  the  PES,  either  for  the  whole  census  or  for  subgroups. 

We use the symbol $q$ with appropriate subscripts to
represent the probability that an individual is missed in the
census, and $p = 1 - q$ to denote the complementary probability
of being counted. The subscripts attached to $p$ and $q$
define levels of factors which affect $p$ and $q$. For purposes of
illustration, we will assume that categories are defined for sex,
age groups, and race groups represented by subscripts $u$, $v$,
and $w$, respectively, and we will represent the triple $(u,v,w)$
by the single symbol $\mu$ for convenience. We will also assume
that the symbol $\nu$ denotes $(i,j,k,l,m)$, where $i$ represents PSU,
$j$ represents SSU within PSU, $k$ represents TSU within SSU, $l$
represents household within TSU, and $m$ represents an
individual within a household.

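A typical logistic model might assume the mathematical
form

$$\operatorname{logit}(p_{\mu\nu}) = \theta_\mu \qquad (1)$$

where the logit function is defined by

$$\operatorname{logit}(p) = \ln\left(\frac{p}{1-p}\right) = \ln\left(\frac{p}{q}\right) \qquad (2)$$

Note that

$$\operatorname{logit}(p_{\mu\nu}) = -\operatorname{logit}(q_{\mu\nu}) \qquad (3)$$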
The subscript $\mu$ on $\theta$ in model (1) indicates that the logit is
allowed to depend on the sex, age, race combination defined
by $\mu = (u,v,w)$. No local area effect appears in model (1),
indicating that any variation in undercounts between areas
would be due to variation in the local area distributions
across sex-age-race classes.
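
As a minimal illustration of model (1), the following Python sketch defines the logit and its inverse and shows how two areas that share the same class-level probabilities can still differ in overall miss rate purely through their class composition. The class names, probabilities, and population figures are hypothetical and are not taken from any PES.

```python
import math

def logit(p: float) -> float:
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def inv_logit(x: float) -> float:
    """Inverse of logit: maps a real number back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical theta values (logit of being counted) for two classes.
theta = {"class_a": logit(0.98), "class_b": logit(0.90)}

def area_miss_rate(true_population: dict) -> float:
    """Overall miss rate implied by a class-only model for a given class mix."""
    total = sum(true_population.values())
    missed = sum(n * (1.0 - inv_logit(theta[c]))
                 for c, n in true_population.items())
    return missed / total

# Same class probabilities, different composition (hypothetical figures).
area_1 = {"class_a": 900, "class_b": 100}
area_2 = {"class_a": 500, "class_b": 500}
rate_1 = area_miss_rate(area_1)
rate_2 = area_miss_rate(area_2)
```

Under these assumed inputs the second area has the higher miss rate only because it contains more members of the harder-to-count class, which is the sole source of between-area variation that model (1) allows.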




By  introducing  a  local  area  effect,  the  model  can  be 
improved  to  allow  for  variation  in  undercount  rates  beyond 
that  which  is  due  to  differences  in  sex-age-race  population 
distributions.  One  such  model  would  be 


$$\operatorname{logit}(p_{\mu\nu}) = \theta_\mu + \phi_{j(i)} \qquad (4)$$

Here, the geographic parameter $\phi_{j(i)}$ depends only on the $j$th
SSU within the $i$th PSU. The additive form $\theta_\mu + \phi_{j(i)}$ means
that no interaction is permitted between the sex-age-race
effect $\theta_\mu$ and the local area effect $\phi_{j(i)}$.

A more realistic model might permit the race effect to
depend on the local area, so the term $\phi_{j(i)}$ appearing in
model (4) could be replaced by $\phi_{wj(i)}$, so that model (4)
would become

$$\operatorname{logit}(p_{\mu\nu}) = \theta_\mu + \phi_{wj(i)} \qquad (5)$$

Both models (4) and (5) predict that the undercount rate will
vary with racial composition, but model (5) bases the
estimate of such variation on each local area, while model
(4), like model (1), uses an aggregation of local areas to
predict the effects of varying racial composition.

Models (4) and (5) have the advantage of increasing realism,
but have the disadvantage that only data within the local
area $j(i)$ can be used to estimate the local area effect $\phi_{j(i)}$ in
model (4) and the local area-race interaction effect $\phi_{wj(i)}$ in
model (5). Furthermore, for either of models (4) or (5), the
possibility of estimation for local areas $j(i)$ not included in
the PES is not immediately obvious. A major purpose of the
random effects models, which we introduce in section 3, is to
permit a compromise between model (1) and models (4) and
(5), so that local data can be used to an appropriate extent
when available.

Another device for improving logistic models is to
introduce covariates into the model. For example, if $X_{j(i)}$ is
a measure of the wealth of local area $j(i)$, then model (5)
might be revised to

$$\operatorname{logit}(p_{\mu\nu}) = \theta_\mu + \phi_w X_{j(i)} \qquad (6)$$

which permits the influence of race on undercount to depend
on the wealth of a local area, and simultaneously improves
the usefulness of the model relative to model (5) because it
applies to any local area for which $X_{j(i)}$ is available,
including areas not in the PES.

Techniques  for  estimating  parameters  in  models  such  as 
(1),  (4),  (5),  and  (6)  generally  require  iterative  computation, 
but  computer  programs  are  widely  available.  The  standard 
methods  produce  estimates  which  are  maximum  likelihood 
under  the  assumption  that  the  individuals  in  the  PES  are 
enumerated  or  not  according  to  independent  binomial 
drawings with probabilities $p_{\mu\nu}$ and $q_{\mu\nu}$. Even though the
independence assumption is generally false, the estimates will
be reasonably good with moderately large samples. The
reason for the failure of the independent binomial assumption
is that the fixed effects models such as (1), (4), (5), and
(6) cannot incorporate the intraclass correlation among $p_{\mu\nu}$
within  a  household  or  TSU,  or  perhaps  even  within  an  SSU, 
without  suffering  too  much  degradation  of  accuracy  due  to 
small  sample  size.  It  follows  that  another  major  reason  for 
going  to  random  effects  models  later  is  to  permit  credible 
likelihood  analysis  and  thence  credible  Bayesian  inferences 
about  undercount  rates. 
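
The iterative computation mentioned above can be sketched with a small Newton-Raphson routine. The version below is only an illustration under the independent binomial assumption, for a hypothetical two-parameter model logit(p) = a + b x with grouped data; the function name, the data, and the single covariate are assumptions made for the example and are not the authors' program.

```python
import math

def fit_logistic(x, counted, total, iterations=25):
    """Newton-Raphson maximum likelihood for logit(p_i) = a + b * x_i.

    x[i] is a covariate value, and counted[i] is the number enumerated
    out of total[i] PES individuals in group i.
    """
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for xi, ci, ni in zip(x, counted, total):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            resid = ci - ni * p        # contribution to the score
            w = ni * p * (1.0 - p)     # contribution to the information
            g_a += resid
            g_b += resid * xi
            h_aa += w
            h_ab += w * xi
            h_bb += w * xi * xi
        # Solve the 2x2 system (information matrix) * step = score.
        det = h_aa * h_bb - h_ab * h_ab
        step_a = (h_bb * g_a - h_ab * g_b) / det
        step_b = (h_aa * g_b - h_ab * g_a) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < 1e-10:
            break
    return a, b

# Hypothetical grouped PES data: two groups, coded x = 0 and x = 1.
a_hat, b_hat = fit_logistic([0.0, 1.0], [90, 95], [100, 100])
```

With two groups and two parameters the model is saturated, so the fitted probabilities reproduce the observed proportions counted (0.90 and 0.95 in this made-up example), which gives a convenient check on the iteration.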

Even  without  the  complication  caused  by  the  inappropri- 
ateness  of  the  independent  binomial  sampling  assumption,  it 
is  not  possible  to  estimate  the  mean  square  error  of  estimates 
provided  by  reduced  models  such  as  those  discussed  in  this 
section.  Variance  estimates  depend  on  the  assumed  model 
being  true  and  do  not  reflect  the  bias  which  is  inherent  in 
estimates  based  on  models  with  reduced  numbers  of  para- 
meters. Gonzalez  and  Waksberg  [11]  consider  an  ad  hoc 
method  for  obtaining  a  measure  of  the  mean  square  error  for 
their  synthetic  estimates.  Holt,  Smith,  and  Tomberlin  [12] 
derive  mathematical  expressions  for  the  total  mean  square 
error  of  the  predictive  estimates  obtained  from  the  analysis 
of  variance  type  models,  but  they  do  not  suggest  how  one 
would estimate these measures. As we show later,
the  posterior  variances  available  when  more  complete, 
random-effects  models  are  used  provide  measures  of  the 
reliability  of  undercount  estimates. 

The  choice  of  a  specific  logistic  model  depends  partly  on 
what  is  plausible  a  priori  and  also  on  what  appears  to  fit  the 
PES  data.  That  is,  extensive  data  analysis  should  be  per- 
formed, trying  many  different  models  before  narrowing  the 
choice  to  models  which  appear  to  fit  adequately.  One  way  to 
judge  adequacy  of  fit  is  to  look  at  differences  between  actual 
undercount  rates  and  fitted  undercount  rates  for  subgroups 
of  the  PES,  and  to  judge  whether  the  differences  are  large 
enough  to  affect  end  uses  of  the  adjusted  census  figures.  The 
other  way  to  judge  fit  is  through  significance  tests.  Signifi- 
cance tests  are  difficult  to  use  in  the  present  instance  because 
the  only  available  tests  rely  on  the  independent  binomial 
model,  which  is  not  valid.  Reliable  goodness-of-fit  tests  may 
be  feasible  in  the  context  of  the  random  effects  models 
shown later, but the required techniques have not yet been
developed.

We now turn to a discussion of the application of PES
estimates to census counts. The main points can be illustrated
by example. Suppose it is desired to estimate the
undercount for a particular sex-age-race group $\mu = (u,v,w)$
within a particular SSU indexed by $j(i)$, and suppose that
model (5) has been fitted to PES data. Several cases require
separate treatment. If the SSU $j(i)$ is included in the PES,
then $p_{\mu\nu}$ from model (5) is directly fitted from the PES, and
the undercount may be directly estimated from

$$\hat{U}_{\mu j(i)} = \frac{\hat{q}_{\mu\nu}}{\hat{p}_{\mu\nu}}\, C_{\mu j(i)} \qquad (7)$$

where $\hat{p}_{\mu\nu}$ and $\hat{q}_{\mu\nu} = 1 - \hat{p}_{\mu\nu}$ come from the fitted model
and $C_{\mu j(i)}$ is the census count in the subgroup identified by
$\mu$ and $j(i)$. Here we make use of the simple ratio estimate as
described by Cochran [3], among others. If the $i$th PSU is
included in the PES, but SSU $j(i)$ is not sampled, then it is
necessary to replace $\phi_{j(i)}$ in the fitted model (4) by an
average over the fitted values obtained for SSU's $j(i)$ sampled
within the $i$th PSU, so that a fitted $p_{\mu\nu}$ is obtained and
model (7) can be used. Finally, if the $i$th PSU is omitted from
the PES, it is necessary to replace $\phi_{wj(i)}$ in the fitted model
(5) by an average over sampled PSU's.
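
The three cases just described can be sketched as follows. This is a simplified illustration that uses a single local effect per SSU (as in model (4), without the race interaction) and unweighted averages; the dictionary layout, the function names, and the numbers are hypothetical.

```python
import math

def inv_logit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def expand(census_count: float, p_hat: float) -> float:
    """Ratio expansion: estimated number missed, C * q / p."""
    return census_count * (1.0 - p_hat) / p_hat

def local_effect(phi: dict, psu: int, ssu: int) -> float:
    """Choose the local effect for SSU (psu, ssu) under the three cases.

    phi maps (psu, ssu) pairs sampled in the PES to fitted effects.
    """
    if (psu, ssu) in phi:                 # case 1: SSU is in the PES
        return phi[(psu, ssu)]
    by_psu = {}
    for (i, _j), value in phi.items():
        by_psu.setdefault(i, []).append(value)
    psu_means = {i: sum(v) / len(v) for i, v in by_psu.items()}
    if psu in psu_means:                  # case 2: PSU sampled, SSU not
        return psu_means[psu]
    # case 3: PSU not sampled, so average over the sampled PSU's
    return sum(psu_means.values()) / len(psu_means)

# Hypothetical fitted values.
phi_hat = {(1, 1): 0.2, (1, 2): 0.4, (2, 1): -0.3}
theta_mu = 2.5
p_hat = inv_logit(theta_mu + local_effect(phi_hat, 1, 3))
missed_hat = expand(1200, p_hat)
```

Whether the averages should be weighted (for example by SSU size) is a design choice that this sketch does not address.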

Two further points may be noted. First, to obtain
estimates of undercount for aggregated subgroups, the
principle is to estimate for the smallest subgroups for which the
model provides fitted $p_{\mu\nu}$ and then aggregate the estimated
undercounts. For example, suppose we fit model (1), where
the logits depend only on the sex-age-race class $\mu$. Using
maximum likelihood techniques, we can easily show that

$$\hat{p}_{\mu\nu} = \frac{c_\mu}{u_\mu + c_\mu} \qquad (8)$$

Here, $u_\mu$ and $c_\mu$ are the numbers of missed and counted
individuals in the whole sample who are members of the
sex-age-race class $\mu$. The undercount for the $i$th PSU
(whether or not it is in the PES) is estimated by

$$\hat{U}_i = \sum_\mu \frac{u_\mu}{c_\mu}\, C_{\mu i} \qquad (9)$$

where $C_{\mu i}$ is the census count for the subgroup $\mu$ in the $i$th
PSU. It should be noted that this model is very similar to
model III of Holt, Smith, and Tomberlin [12], and, like
that model, it leads to an estimator very similar to the
synthetic estimator of Gonzalez and Hoza [10].
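
A short sketch of this class-only estimate may help. Under model (1) the fitted probability of being counted in each class is the sample proportion counted, and the PSU undercount is the sum over classes of the census count times the ratio of missed to counted in the sample. The class labels and counts below are made up for the example.

```python
def fitted_p(missed: dict, counted: dict) -> dict:
    """Proportion counted in each class, from whole-sample PES totals."""
    return {c: counted[c] / (missed[c] + counted[c]) for c in counted}

def synthetic_undercount(missed: dict, counted: dict,
                         census_counts: dict) -> float:
    """Sum over classes of census count times (missed / counted)."""
    return sum(census_counts[c] * missed[c] / counted[c]
               for c in census_counts)

# Hypothetical PES totals by class, and census counts for one PSU.
pes_missed = {"a": 2, "b": 10}
pes_counted = {"a": 98, "b": 90}
psu_census = {"a": 980, "b": 900}

p_by_class = fitted_p(pes_missed, pes_counted)
psu_undercount = synthetic_undercount(pes_missed, pes_counted, psu_census)
```

Because only class-level rates enter, the same call applies to a PSU that was never sampled, provided its census counts by class are known.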

Second,  a  disaggregated  estimate  such  as  model  (7)  or 
model  (9)  can  be  improved,  perhaps  only  slightly,  by  using 
the  predictive  principle  of  finite  population  estimation 
introduced  by  Royall  [19].  The  idea  is  that  actual  under- 
count is  known  precisely  for  individuals  included  in  the 
PES,  so  that  the  ratio  estimation  principle  embodied  in 
models  (7)  and  (9)  need  be  applied  only  to  individuals  in  the 
census  but  not  in  the  PES.  The  final  estimate  of  undercount 
then  comes  from  summing  the  number  of  actual  missed 
individuals  from  the  PES  and  the  estimated  number  from 
model  (7)  or  (9)  applied  to  the  non-PES  counts. 
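
The predictive principle amounts to a two-part sum, sketched below. The argument names and figures are hypothetical; in particular, the sketch assumes that the number of census-counted persons who fall in the PES is known, so that it can be subtracted from the census count.

```python
def predictive_undercount(pes_missed: int, census_count: int,
                          pes_counted: int, p_hat: float) -> float:
    """Observed PES misses plus ratio expansion on non-PES census persons."""
    non_pes_count = census_count - pes_counted
    return pes_missed + non_pes_count * (1.0 - p_hat) / p_hat

# Hypothetical subgroup: 1,000 counted in the census, of whom 50 are in
# the PES; the PES also found 3 persons the census missed.
estimate = predictive_undercount(3, 1000, 50, 0.95)
```

When the PES is a tiny fraction of the census, the first term is small and the result is close to the plain ratio expansion, which is consistent with the remark that the improvement may be slight.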

Once we have learned to improve fixed-effects logistic
models such as (1), (4), (5), and (6) by including random
effects, as described in the next section, we will be in a position
to specify plausible posterior distributions for the $p_{\mu\nu}$. These
posterior distributions in turn make it possible to refine the
crude ratio-expansion principle (7) and (9) so as to obtain
approximate posterior distributions of undercount, as we
explain later.

RANDOM  EFFECTS  MODELS 

In  this  section,  we  move  towards  more  complex  and 
realistic  models  and  at  the  same  time  become  more  Bayesian 
in  outlook.  The  two  kinds  of  change  are  linked  because  the 
parameter  set  grows  rapidly  as  the  model  becomes  complex, 
and  can  only  be  managed  by  considering  the  parameters  as 
random.  Such  swarms  of  parameters  cannot  be  estimated  by 
classical  inference  methods,  but  are  amenable  to  Bayesian 
treatment.  The  required  statistical  technology  is  feasible  but 
still  in  a  rather  early  stage  of  development  for  logistic 
models.  The  estimator  here  uses  empirical  Bayes  as  defined 
by Robbins [18]. Also, our estimator is similar to the
James-Stein  estimator  as  described,  for  example,  by  Efron 
and  Morris  [7] . 

The basic idea is to include terms in the logistic model (4)
which describe variation in $p_{\mu\nu}$ within each of the stages of
the multistage PES design. Specifically, we may write

$$\operatorname{logit}(p_{\mu\nu}) = \theta_\mu + \phi_i + \phi_{j(i)} + \phi_{k(ij)} + \phi_{l(ijk)} \qquad (10)$$

where the $\phi_i$ are regarded as drawn from a $N(0, \sigma_1^2)$
population, the $\phi_{j(i)}$ from a $N(0, \sigma_2^2)$ population, the $\phi_{k(ij)}$
from a $N(0, \sigma_3^2)$ population, and the $\phi_{l(ijk)}$ from a $N(0, \sigma_4^2)$
population. These random effects imply that individuals in a
PSU have a common element entering their $p_{\mu\nu}$, and the
same occurs for the nested classes of individuals in a common
SSU, common TSU, and finally a common household.
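
To make the nesting concrete, the following sketch draws one realization of the four levels of random effects and adds them up for each household. It is a simulation of the assumed structure only, not a fitting procedure; the design sizes and standard deviations are hypothetical.

```python
import math
import random

def draw_household_effects(rng, shape, sigmas):
    """Draw nested normal effects and return their sum per household.

    shape  = (PSUs, SSUs per PSU, TSUs per SSU, households per TSU)
    sigmas = standard deviations for the PSU, SSU, TSU, household levels
    """
    n_psu, n_ssu, n_tsu, n_hh = shape
    s1, s2, s3, s4 = sigmas
    effects = {}
    for i in range(n_psu):
        phi_i = rng.gauss(0.0, s1)              # shared by the whole PSU
        for j in range(n_ssu):
            phi_j = rng.gauss(0.0, s2)          # shared within the SSU
            for k in range(n_tsu):
                phi_k = rng.gauss(0.0, s3)      # shared within the TSU
                for l in range(n_hh):
                    phi_l = rng.gauss(0.0, s4)  # shared within the household
                    effects[(i, j, k, l)] = phi_i + phi_j + phi_k + phi_l
    return effects

def p_counted(theta_mu, household, effects):
    """Probability of being counted for a person of class mu in a household."""
    return 1.0 / (1.0 + math.exp(-(theta_mu + effects[household])))

rng = random.Random(1)
effects = draw_household_effects(rng, (2, 2, 2, 3), (0.3, 0.2, 0.2, 0.1))
```

Two persons of the same class in the same household receive the same probability here, while persons in different households of the same PSU share only the PSU-level term, which is how the model represents intraclass correlation.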

Without further research, it remains unclear how accurately
the variances $\sigma_1^2$, $\sigma_2^2$, $\sigma_3^2$, and $\sigma_4^2$ can be estimated from
PES data, nor is it easy to see without repeated analyses of
the data what effect different choices of these variances will have on
final undercount estimates. The models do, however, capture
levels of variation which a priori judgment alone strongly
suggests must underlie the PES data.

Some advantages of the hierarchical random effects model
(10) were mentioned previously. Once values of the variances
$\sigma_1^2, \ldots, \sigma_4^2$ are tentatively adopted, it becomes possible to
introduce corresponding factors into the likelihood analysis and hence
produce approximate normal posterior distributions for the
logit($p_{\mu\nu}$), which automatically and correctly weight
undercount frequencies observed at the various levels of the
multistage design. For example, the posterior mean of
logit($p_{\mu\nu}$) for an individual $m(ijkl)$ who appears in the PES
automatically uses information from the individual's household,
TSU, SSU, and PSU. More remarkably, a posterior
mean logit($p_{\mu\nu}$) can be found for an individual $m(ijkl)$ not
in the PES, and again the PES counts are automatically
weighted, where the weighting scheme depends on which if
any among $i$, $j(i)$, $k(ij)$, or $l(ijk)$ appear in the PES. Similarly,
we can find posterior variances which appropriately incorporate
the available information about each individual.
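
The automatic weighting can be illustrated, in a much simplified form, by the familiar normal-normal update for a single level. The sketch below is not the authors' computation: it assumes that a local logit has been observed with a known sampling variance and combines it with a normal prior centered on the aggregate value. All names and numbers are hypothetical.

```python
from typing import Optional, Tuple

def shrink(observed_logit: Optional[float], sampling_var: Optional[float],
           prior_mean: float, prior_var: float) -> Tuple[float, float]:
    """Posterior mean and variance of a logit under a normal-normal model.

    If the unit is not in the PES (no observation), the posterior is the
    prior, so the estimate rests entirely on data from other units.
    """
    if observed_logit is None or sampling_var is None:
        return prior_mean, prior_var
    weight = prior_var / (prior_var + sampling_var)   # weight on local data
    mean = weight * observed_logit + (1.0 - weight) * prior_mean
    var = weight * sampling_var                       # combined precision
    return mean, var

# Hypothetical values: a sampled area and an unsampled area.
sampled = shrink(3.0, 1.0, 2.0, 1.0)
unsampled = shrink(None, None, 2.0, 1.0)
```

The weight on local data grows as the local sampling variance falls, so a well-sampled area relies mostly on its own PES results while a thinly sampled one is pulled toward the aggregate.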

The  basic  mathematical  development  facilitating  approxi- 
mate computation  of  the  required  posterior  means  and 
variances  appears  in  Laird  [14].  Some  initial  experience 
with  variance  estimation  is  found  in  Miao  [15].  Neither  of 
these  papers  treats  examples  of  the  degree  of  complexity 
required  for  a  real  PES,  so  that  detailed  research  and 
development  will  be  needed,  but  the  principles  are  in  place. 

Two  possible  extensions  of  the  random  effects  model  (10) 
deserve mention. The first of these allows that any of the $\phi$
terms  appearing  in  the  model  can  have  a  dependence  on 
characteristics  of  the  corresponding  sampling  unit,  whether  it 
be  household,  TSU,  SSU,  or  PSU.  The  predictive  accuracy 
for  local  areas  with  known  characteristics  is  enhanced  when 
the  undercount  rate  shows  dependence  on  available  charac- 
teristics. Such  a  technique  is  used,  in  essence,  by  Fay  and 
Herriot  [9]  in  the  context  of  a  standard  linear  regression 
model. 

The  second  extension  would  allow  interactions  between 
fixed  and  random  effects  in  the  model.  For  example,  the  race 
effect  might  be  postulated  to  vary  randomly  from  PSU  to 
PSU.  Such  an  interaction  random  effect  would  require  yet 
another  variance  component  in  the  model,  but  it  could  be 
important  to  allow  for  such  randomly  varying  race  effects  in 
order  to  obtain  realistic  posterior  variances  of  minority 
undercounts  in  local  areas. 

A  BAYESIAN  TREATMENT  OF 
CENSUS  UNDERCOUNT 

The  purpose  of  this  section  is  to  follow  through  the  logic 
of  Bayesian  estimation  of  undercount.  To  be  specific,  we 
consider  how  to  estimate  the  number  missed  in  a  particular 
sex-age-race category $\mu$ and a specific local area $k(ij)$. The
estimate  is  defined  to  be  the  posterior  expectation  of  the 
unknown  undercount.  Since  expectation  is  a  linear  opera- 
tion, the  posterior  expectation  of  a  more  aggregated  under- 
count is  found  by  aggregating  the  corresponding 
disaggregated  posterior  expectations. 

The  PES  contributes  to  Bayesian  estimation  in  two  ways. 
The  first  way  is  trivial,  namely,  if  the  local  area  k(ij)  is  part 
of  the  PES,  then  a  certain  number  of  individuals  are 
identified  with  probability  one  as  belonging  to  the  under- 
count and  contribute  themselves  to  the  posterior  expected 
undercount  as  whole  units.  The  more  difficult  task  is  to 
estimate  the  undercount  among  the  remaining  individuals  not 
picked  up  in  the  PES,  which  is  to  say,  the  vast  majority.  For 
this latter purpose, the PES contributes a posterior distribution
of the unknown $p_{\mu\nu}$ as discussed previously. In order to
focus the attention of this section on census data, we assume
for most of the discussion that the probabilities $p_{\mu\nu}$
are known. The final estimation step is thus to
average the conditional estimates given $p_{\mu\nu}$ over the posterior
distribution of $p_{\mu\nu}$.

Suppose that true population size is $N = U + C$, where $U$ is
the undercount and $C$ is the census count in subgroup $k(ij)$.
Let $N_\mu = U_\mu + C_\mu$ be the corresponding counts for the
sex-age-race classes, with $q_\mu$ the corresponding undercount
probabilities. We assume that any individuals picked up in the
PES are excluded from these counts. Note that we have
chopped the subscripts $k(ij)$ from $p_\mu$, $N$, $U$, $C$, $N_\mu$, $U_\mu$, and
$C_\mu$ to save space.

If $h(U)$ denotes the joint prior density of the $U_\mu$, then
the posterior density of $U$ is proportional to

$$h(U) \prod_\mu \binom{U_\mu + C_\mu}{C_\mu}\, p_\mu^{C_\mu}\, q_\mu^{U_\mu} \qquad (11)$$

Provided the $C_\mu$ are reasonably large, the joint prior density
has little influence and can be taken to be uniform, so that
model (11) becomes a product of negative binomial densities.
Thus, conditional on the $C_\mu$ and $p_\mu$, the $U_\mu$ are independent
negative binomials with means approximately at the ratio
expansion value

$$\hat{U}_\mu = \left(\frac{q_\mu}{p_\mu}\right) C_\mu \qquad (12)$$

thus connecting Bayesian techniques with standard estimation
procedures. On the other hand, model (11) is a standard
Bayesian formula appearing, for example, in Draper and
Guttman [6]. Thus, the mean of the posterior distribution
of $U$, the undercount for the $k(ij)$th TSU, is just the sum
of the posterior means of the $U_\mu$'s, i.e.,

$$\hat{U} = \sum_\mu \left(\frac{q_\mu}{p_\mu}\right) C_\mu \qquad (13)$$

which is the separate ratio expansion as defined by Cochran
[3], for example.
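
The statement that the posterior means lie approximately at the ratio expansion value can be checked numerically. The sketch below assumes a uniform prior and a single class, so that the posterior weight on an undercount of U is proportional to the binomial coefficient of (U + C) over C times q to the power U; it then normalizes these weights over a long finite range. The census count and probability used are hypothetical.

```python
import math

def posterior_mean_undercount(census_count: int, p: float,
                              max_u: int = 2000) -> float:
    """Posterior mean of U given C and p under a uniform prior on U."""
    q = 1.0 - p
    log_weights = [
        math.lgamma(u + census_count + 1) - math.lgamma(u + 1)
        - math.lgamma(census_count + 1) + u * math.log(q)
        for u in range(max_u + 1)
    ]
    peak = max(log_weights)                      # guard against overflow
    weights = [math.exp(lw - peak) for lw in log_weights]
    total = sum(weights)
    return sum(u * w for u, w in enumerate(weights)) / total

count, p_known = 50, 0.9
exact_mean = posterior_mean_undercount(count, p_known)
ratio_expansion = count * (1.0 - p_known) / p_known
```

Under these assumptions the posterior is negative binomial with mean (C + 1) q / p, which exceeds the ratio expansion C q / p by q / p; the gap is negligible relative to the estimate once C is reasonably large.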

Refinements involving smoothing the $C_\mu$ or weighting
from other strata may be needed if the $C_\mu$ are too small. In
principle, these require Bayesian modeling of the census
counts. No discussion of such modeling is included in this
paper.

To summarize, we obtain a basic posterior distribution of
$U_\mu$ conditional on $p_\mu$. The posterior density thus obtained
must be averaged over the posterior of $p_\mu$, which is
considered fixed in this development, and also may need to
be averaged over the posterior distribution of $C_\mu$ if the $C_\mu$
are small enough to introduce sampling error comparable to
the sampling error of $p_\mu$ from the PES. In practice, normal
distributions can be used to approximate the required
posterior distributions. If posterior variances are required for
the $U_\mu$ and their aggregates, then approximate normal
covariances are needed for $U_\mu$ both within and between
local areas. Considerable bookkeeping is needed to keep track
of such covariances, but they are not difficult to compute.
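
The averaging step can be sketched by simulation for a single class. The sketch assumes, for illustration only, that the posterior of the logit of p is normal with a given mean and standard deviation, and that the undercount given p is negative binomial as in the preceding development; the marginal mean and variance then follow from averaging the conditional moments. The function name and the numerical inputs are hypothetical.

```python
import math
import random

def marginal_undercount(census_count, logit_mean, logit_sd,
                        draws=2000, seed=0):
    """Monte Carlo mean and variance of U, averaging over logit(p).

    Given p, U has mean (C + 1) q / p and variance (C + 1) q / p**2.
    """
    rng = random.Random(seed)
    r = census_count + 1
    means, variances = [], []
    for _ in range(draws):
        x = rng.gauss(logit_mean, logit_sd)   # a draw of logit(p)
        odds_missed = math.exp(-x)            # equals q / p
        means.append(r * odds_missed)
        variances.append(r * odds_missed * (1.0 + odds_missed))
    avg_mean = sum(means) / draws
    within = sum(variances) / draws           # average conditional variance
    between = sum((m - avg_mean) ** 2 for m in means) / draws
    return avg_mean, within + between

mean_fixed, var_fixed = marginal_undercount(1000, 3.0, 0.0)
mean_spread, var_spread = marginal_undercount(1000, 3.0, 0.3)
```

With these made-up inputs, allowing for uncertainty in p increases the variance of the undercount substantially, which illustrates why treating p as known understates the posterior variability.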


CONCLUDING  REMARKS 

We  have  argued  in  this  paper  that  a  well  designed  and 
executed  PES  is  an  effective  tool  for  assessing  disaggregated 
census  undercounts  and  may  be  the  only  trustworthy 
method  available  for  local  areas.  A  necessary  complement  to 
the  PES  data  is  careful  and  extensive  analysis  of  the  data 
directed  towards  modeling  the  relationships  between  under- 
count  rate  and  characteristics  of  individuals,  households,  and 
local  areas.  Such  modeling  is  a  necessary  prerequisite  to 
carrying  out  Bayesian  inference  aimed  at  assessing  approxi- 
mate posterior  distributions  of  undercount  for  special 
subgroups  of  the  population. 

Several  mechanisms  are  available  for  incorporating 
non-PES  information  into  the  PES  analysis.  First,  if  separate 
estimates  of  undercount  rate  are  available  for  local  areas, 
such  estimates  can  be  introduced  directly  into  a  logistic 
model, much as the variable $X_{j(i)}$ was inserted into model
(6), except that the coefficient need not depend on $\mu$.
Secondly, if prior information about undercount rates by
sex-age-race is available from independent sources such as
pilot studies and studies of another census, such prior
information may be entered directly into the Bayesian
analysis suggested in section 3. What is needed is a way to
translate such prior information into a roughly corresponding
normal prior distribution of the $\theta_\mu$ parameters of models
such as (1), (4), (5), and (6).

The  Bayesian  approach  suggested  here  has  implications  for 
the  design  and  execution  of  the  PES.  For  example,  our 
analyses  require  that  PES  information  be  saved  for  detailed 
analysis  down  to  the  finest  levels  of  disaggregation;  a  simple 
requirement,  but  one  apparently  not  met  for  the  1976 
Canadian  PES.  A  more  subtle  implication  is  that  the  PES 
should  be  designed  with  the  widest  possible  coverage. 
Bayesian  estimates  for  local  areas  can  be  expected  to  differ 
considerably,  depending  on  whether  and  how  extensively  the 
local  area  is  represented  in  the  PES.  If  the  local  area  is  not 
represented,  then  the  corresponding  estimates  must  depend 
entirely  on  data  from  other  local  areas,  whereas  a  represented 
local  area  can  give  weight  to  specific  data  for  the  local  area. 
The  traditional  frequency  theory  used  to  design  surveys  does 
not  come  to  terms  with  such  differences  in  posterior 
accuracy,  which  careful  Bayesian  analysis  will  demonstrate. 

Bayesian analysis makes possible a cost/benefit approach
to the question of how many resources should be devoted to
a  PES.  For  example,  it  should  be  possible  to  assign  an 
operationally  meaningful  cost  due  to  the  misallocation  of  tax 
funds  due  to  inaccurate  census  figures.  Given  a  posterior 
distribution of correct census figures, one can define an
improved allocation system which will minimize the
misallocation cost, given the state of knowledge defined by the
posterior  distribution.  Hence  savings  from  the  PES  can  be 
estimated,  and  savings  from  a  larger  or  smaller  PES  can  also 
be  estimated,  leading  to  rational  comparisons  of  different 
designs.  It  would  similarly  be  possible  to  consider  tradeoffs 
between  (a)  expenditures  to  increase  census  accuracy  and  (b) 
expenditures  to  improve  census  accuracy  after  the  fact  using 
a  PES. 

Our  analysis  of  undercount  has  relied  on  the  concept  of 
the probability $p_{\mu\nu}$ that an individual will be enumerated in
the  census.  We  are  implicitly  combining  individuals  belonging 
to  households  which  are  contacted  with  individuals  belonging 
to  households  which  are  missed.  A  further  refinement  would 
be  to  make  separate  estimates  of  undercount  for  the  two 
classes  of  individuals.  The  analysis  for  the  first  class  would 
parallel  that  given  earlier  in  the  paper,  while  the  latter  would 
require  models  for  the  probability  of  missing  a  household 
considered  jointly  with  household  size  and  composition.  We 
do  not  attempt  to  pursue  details  here. 
INTRODUCTION

The  title  given  this  paper  reflects  the  author's  original 
intention  to  submit  for  the  review  of  this  conference  a 
detailed  plan  to  form  estimates  of  net  census  error  for 
counties  on  the  basis  of  data  expected  from  the  census 
evaluation  program.  The  author  has  instead  taken  a  some- 
what different  course:  To  outline  for  the  purposes  of  other 
researchers  the  basic  scope  of  the  evaluation  data;  to 
emphasize  aspects  of  the  data  that  may  impact  on  the 
question  of  small  area  estimation;  and  to  sketch  a  possible 
program  of  estimation  that  might  be  developed  to  produce 
estimates  for  counties  and  other  sub-State  areas. 

Some  general  comments  are  first  in  order.  It  should  be 
clearly  noted  that  the  focus  of  this  paper  is  on  the  technical 
issues  associated  with  the  estimation  of  net  census  error,  as 
opposed  to  the  policy  issues  arising  from  adjustment  of  the 
census  counts.  The  author  intends  that  this  paper  not  be 
construed  in  any  fashion  to  advocate  a  position  on 
adjustment. 

As  a  second  general  comment,  the  paper  proceeds  on  a 
presumption  that  there  will  be  tolerance  of  potentially 
complex  estimation  procedures,  provided  that  such  an 
approach can be shown to have attractive statistical properties.
Certainly, others at this conference have argued that
simplicity  has  particular  policy  virtues,  but  this  paper  will 
view  simplicity  as  an  unnecessary  constraint  in  forming 
estimates  that  fully  capture  the  information  given  by 
available  data. 

DESCRIPTION  OF  THE  EVALUATION  PROGRAM 

The currently envisioned coverage evaluation program for
the 1980 census comprises three major projects: a
Postenumeration Survey (PES), which attempts to measure all
aspects of census coverage error by a direct household
survey; measurement of the census coverage of sample
households in the Current Population Survey (CPS), which
attempts to represent most but not all aspects of census
coverage; and an Administrative Record Match Study (ARM),
which uses a sample of persons to measure the completeness
of coverage of the combined IRS-Medicare files and, by
implication, the true total population and concomitant net
census error.1 These three programs have already been
described at this conference, but they will be reviewed here
from  the  perspective  of  their  possible  utility  in  making 
estimates  of  net  census  error  for  relatively  small  geographic 
units  such  as  counties.  This  review  will  proceed  by  first 
making  a  number  of  observations  about  general  aspects 
common  to  all  three  before  noting  their  individual  features. 

Interlocking  Nature  of  the  Projects 

Although  the  three  parts  of  the  evaluation  program  may 
be  conceptually  separated,  they  must  be  interlocked  in  the 
estimation.  Only  the  PES  is  designed  to  stand  fully  on  its 
own  merits,  at  least  in  theory,  as  a  measure  of  net  census 
error  for  the  entire  population.  In  practice,  however,  the  PES 
may  be  the  most  deficient  in  some  aspects,  and  data  from  the 
CPS  or  ARM  may  be  used  to  remedy  its  problems.  The  CPS 
program  by  itself  fails  to  represent  the  effect  on  net  census 
error  of  erroneously  included  or  duplicated  households  in  the 
census  counts.2  The  ARM  is,  in  a  sense,  the  most  dependent 
of  the  programs,  since  its  sample  will  be  drawn  jointly  from 
the  CPS  and  PES.  In  addition,  the  ARM  will  measure  the  net 
census  error  for  the  adult  population  only. 

The  interlocking  feature  of  the  design  may  have  important 
implications  for  small  area  estimation.  For  example,  a 
potential  difficulty  arises  for  any  regression  method 
attempting  to  analyze  the  data  on  a  county  level.  Both  PES 
and  CPS  are  designed  with  a  first-stage  selection  of  counties 
in  most  States,  where  the  PES  selection  of  counties  is 
independent of the CPS selection. Consequently, estimates for
counties with CPS sample only will be biased by the omission
of the components of overenumeration estimated only in the
PES.3 Most (or all) regression approaches that have been
considered in the literature on small area estimation presume
conditionally unbiased (although perhaps high-variance)
estimates for the sampled first-stage units.


2  The  E-sample  is  now  intended  for  this  purpose. 

3 Tentatively,  the  E-sample  will  be  selected  from  the  same 
first-stage  counties  as  CPS,  thus  in  large  part  eliminating  this 
difficulty. 


1 The PES has been eliminated as an independent survey. In its
place,  a  sample  (E-Sample)  will  be  drawn  from  the  census  to  represent 
the  components  of  net  error  not  available  from  CPS. 


The  main  text  of  this  paper  remains  essentially  as 
delivered at the conference. By April 1980, the
planned evaluation program had been radically altered. A series
of  footnotes  to  the  text  will  summarize  the  most 
important  changes  to  April  1980. 
A second effect of the interlocking design is that the
complexities  of  the  estimation  will  probably  induce  a 
complex  covariance  structure  among  the  sample  estimates  for 
individual  counties  or  other  small  areas.  In  turn,  a  complex 
covariance  structure  will  increase  the  technical  difficulty  in 
fully  analyzing  regression  estimates  for  small  areas. 

Scope  of  the  Evaluation  Program 

A  second  comment  serves  only  to  caution  against  viewing 
the  PES-CPS-ARM  program  as  completely  given  at  this  date. 
As  a  consequence  of  discouraging  pretest  results,  the  merits 
of  the  PES  program  and  its  currently  projected  sample  size 
are  under  review.  Other  awaited  test  results  may  help  to 
determine  the  scope  of  the  ARM  project,  which  could  be 
undertaken  under  a  number  of  options  on  the  use  of  the  CPS 
and PES samples. This uncertainty, although temporary,
limits the ability to plan detailed estimation procedures
at this time, since the sample size and its distribution
are not yet fixed.

Variance  of  the  Estimates 

A  related  comment,  with  implications  similar  to  those  of 
the  preceding  point,  is  that  the  reliability  for  a  given  sample 
size  remains  a  matter  of  speculation  at  this  point.  This 
problem  extends  across  all  three  procedures.  For  example, 
our  original  design  assumptions  for  the  PES  were  based  on  a 
presumption  of  binomial-like  variation  in  census  omissions 
affected  by  a  moderate  to  large  within-household  correlation. 
In fact, however, it appears that several factors inflate the
sampling variance of the estimated net error by substantial
amounts: variation in household size, intraclass correlation
between housing units on a block, high correlation within
households, and the potentially high absolute level of gross
omissions and erroneous enumerations associated with a more
moderate net census error. Sampling variances estimated for
the pretest results were about 6 to 20 times those that
would have been guessed based on our original design
assumptions.
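
One way to see how quickly clustering alone can inflate a
binomial-based variance is the familiar approximation
deff = 1 + (b - 1) * rho for clusters of equal size b with
intraclass correlation rho. The sketch below is illustrative
only: the function names and all numeric inputs are assumed
values, not pretest figures, and the formula ignores the other
factors listed above (unequal household sizes, block-level
correlation, and offsetting gross errors).

```python
def binomial_variance(p, n):
    """Variance of a sample proportion under simple binomial sampling."""
    return p * (1.0 - p) / n


def design_effect(cluster_size, rho):
    """Approximate variance inflation for equal-sized clusters
    with intraclass correlation rho (assumed form: 1 + (b - 1) * rho)."""
    return 1.0 + (cluster_size - 1.0) * rho


def clustered_variance(p, n, cluster_size, rho):
    """Binomial variance multiplied by the approximate design effect."""
    return binomial_variance(p, n) * design_effect(cluster_size, rho)


# Hypothetical inputs: 5 percent omission rate, 1,000 sampled persons,
# households of 3 persons, within-household correlation of 0.5.
base = binomial_variance(0.05, 1000)
inflated = clustered_variance(0.05, 1000, cluster_size=3, rho=0.5)
ratio = inflated / base  # equals the design effect for these inputs
```

With these assumed inputs the within-household term by itself
only doubles the variance, which suggests why the additional
factors named in the text would be needed to account for
inflation of the size observed in the pretest.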

Resolution  of  Conflicting  Results 

The  coverage  evaluation  program  includes  three  separate 
projects,  partly  for  reasons  of  reducing  variance,  but  also 
partly  in  an  attempt  to  find  methods  to  reduce  bias.  This 
approach  leads  naturally  to  potentially  conflicting  results;  for 
example,  for  some  segments  of  the  population,  up  to  four 
separate  estimates  of  error  will  be  possible:  from  the  PES, 
the  April  CPS,  the  August  CPS,  and  the  ARM.4  Resolution 
of  systematic  conflicts  not  due  to  sampling  error  alone  will 
require  careful  analysis,  but  a  likely  outcome  will  be  the 
production  of  possible  alternative  sets  of  estimates. 


This  potential  conflict  transfers  directly  to  small  area 
estimation.  Much  of  the  analysis  of  the  discrepancies 
between  estimates  will  be,  of  necessity,  on  a  national  basis; 
yet,  the  critical  question  for  small  area  estimation  will  be  to 
obtain  sample  data  that  correctly  represent  differences 
among areas (except for sampling variability). This requirement
is especially important for regression analysis, but
synthetic estimation must make equally strong assumptions.

Timing  of  the  Results 

Timing  of  the  results  leads  to  a  complication  similar  in 
effect  to  the  preceding  issue.  Optimistically,  preliminary 
analysis  of  the  PES  and  CPS  aspects  is  possible  by  late 
1981,5 but the ARM will trail more than a year behind.
Preliminary  estimates  based  only  on  the  PES  and  CPS 
components  are  a  possible  and  perhaps  likely  outcome;  yet, 
these  initial  data  may  be  improved  by  the  results  from  the 
ARM  a  year  later.  This  point,  along  with  the  preceding,  tends 
to  suggest  that  there  will  be  more  than  one  set  of  estimates 
of  net  undercount. 

CPS  and  PES:  Character  of  the  Basic  Data 

The  major  objective  of  the  coverage  evaluation  program  is 
the  estimation  of  the  net  error  of  the  census  figures.  The 
survey  procedures  do  not  identify  specific  persons  as  "net 
missed  persons";  rather,  separate  estimates  of  gross  omissions 
from  the  census  and  erroneous  inclusions  in  the  census  are 
obtained. Consequently, models that state the estimation
problem in terms of a Bernoulli trial (in or out of the
census) understate the complexity of the estimation.

The  complexity  is  compounded  by  the  survey  procedures 
that  must  be  used  in  practice.  For  example,  a  single  person 
may  be  treated  as  both  missed  from  the  census  and 
erroneously  enumerated  if  the  census  assigns  the  person  to  an 
enumeration district quite far from where the person lives. In
this instance, the person is both omitted from the correct
enumeration district and erroneously included in the
enumeration district assigned by the census. Unfortunately,
the estimates of omissions and of erroneous inclusions come
from separate samples, so that this hypothetical person may
be included in the sample estimate for one but not
the other. Such aspects of the problem tend to increase the
difficulty  of  satisfactorily  modeling  census  coverage  at  the 
level  of  the  person. 

CPS  and  PES:  Nonlinearity 

The  estimates  of  net  census  error  from  the  CPS  and  PES 
surveys  are  nonlinear  functions  of  the  estimated  erroneous 


4 Under the revised plan, the number is three.


5  The  timing  appeared  as  a  critical  consideration  in  eliminating  the 
separate  PES.  Current  (April  1980)  projections  suggest  that  the  CPS 
results may be available by summer 1981.
omissions and inclusions. This nonlinearity, in turn, will have
an insidious effect on our ability to aggregate and
disaggregate the estimates of net census error. This problem
will have virtually no effect at high levels of geographic and
demographic aggregation, such as for estimates of net error
for the total population by State, but may soon appear below
that level. Considerable care will be required in using either
regression or synthetic approaches to accommodate this
problem.

ARM:  Complexity  of  the  Small  Area  Data 

The  basic  approach  taken  by  ARM  is  to  use  a  sample  of 
persons  matched  to  the  administrative  record  series  in  order 
to estimate the rate at which persons of different demographic
characteristics appear in the series. The true national
population  may  then  be  inferred  from  the  administrative 
totals.  (The  actual  process  is  somewhat  more  complex  but 
conceptually  equivalent.) 

The  data  from  the  ARM  become  progressively  more 
complex  as  disaggregation  is  pursued.  At  some  level,  the 
estimation  must  incorporate  the  fact  that  the  address  present 
in  a  record  series  may  not  be  the  correct  address  for  census 
purposes.  Consequently,  the  available  data  form  a  complex 
matrix  relating  census  and  record  addresses.  Details  of  this 
estimation  remain  to  be  developed. 

Summary  of  Implications  for  Small  Area  Estimation 

The  purpose  of  the  preceding  sections  was  to  outline  some 
of  the  current  difficulties  in  the  evaluation  program  and  to 
touch  on  a  number  of  considerations  that  should  eventually 
be  incorporated  into  a  small  area  estimation  program. 
Certainly,  a  common  theme  was  the  complex  nature  of  the 
data  to  be  obtained.  Some  points,  such  as  the  nonlinearity  of 
the  estimates,  will  require  careful  treatment  in  time,  but  for 
the  interim  may  be  replaced  by  simplifying  assumptions. 
Other  aspects,  such  as  the  recognition  of  the  potentially 
complex  relationships  between  omissions  and  erroneous 
inclusions  in  the  PES  and  CPS  surveys,  appear  to  be 
prerequisites  for  the  formulation  of  applicable  models. 

Regression  approaches  to  model  county  sample  estimates 
as  the  unit  of  analysis,  and  synthetic  approaches  formulated 
in  terms  of  characteristics  of  missed  (or  erroneously 
included)  persons  represent  common  approaches  or  starting 
points  for  more  complex  procedures  that  have  frequently 
been  applied  to  the  problem  of  small  area  estimation.  The 
preceding  discussion  should  suggest  that  both  methods  will 
require  considerable  technical  care  to  be  adapted  to  this 
application.  Regression  techniques  will  be  affected  by  the 
potentially  high  level  of  the  variances,  which  may  limit  the 
complexity  of  the  model  and  the  direct  evaluation  of  the  fit 
of  the  model  on  the  basis  of  the  sample  data.  Biases  in  the 
sample  data  that  are  differential  by  geographic  area  may  be 
partially mirrored by the regression estimates. The
nonlinearity of the estimation procedures will also have to be
taken  into  consideration  in  forming  the  sample  estimates  for 
counties,  since  there  is  a  potential  for  bias  in  the  regression 
estimates  from  this  source. 

Synthetic  estimates  will  be  similarly  affected.  High 
sampling  variances  may  prove  some  limitation  on  the 
complexity  of  the  synthetic  estimates,  but  perhaps  not  to  the 
same  degree  as  for  regression.  Again,  however,  the  amount  of 
information  provided  by  the  sample  data  on  the  usefulness 
and  reliability  of  the  synthetic  estimates  will  be  restricted  to 
about  the  same  degree  as  regression  estimation.  Differential 
biases  in  the  sample  will  affect  the  synthetic  estimates,  but 
here  the  key  question  concerns  the  extent  of  this  bias  by  the 
analytic  categories  used  to  form  the  synthetic  estimate. 

POSSIBLE  EMPIRICAL  BAYES  DIRECTIONS 
FOR  SMALL  AREA  ESTIMATION 

The  preceding  section  identified  a  number  of  general 
methodological  problems  related  to  designing  and  producing 
small  area  estimates  of  census  undercoverage.  The  emphasis 
was  on  obstacles  and  complexities  in  an  effort  to  help  other 
researchers  incorporate  these  considerations  in  planning  a 
program  of  small  area  estimation.  This  section  will  take  a 
more  constructive  tone  and  discuss  some  possible  ways  in 
which  empirical  Bayes  procedures  might  assist  in  the 
estimation. 

First,  the  allocation  and  distribution  of  sample  cases  for 
the  PES  and  CPS  will  be  reviewed  for  possible  implications 
for  specific  small  area  approaches.  Next,  other  aspects  of 
modeling  census  undercoverage  will  be  similarly  explored. 
From  this  point,  a  proposed  general  strategy  will  be  sketched 
for  incorporating  empirical  Bayes  ideas  into  the  estimation. 

Summary  of  the  PES  Sample  Allocation 

As  it  is  currently  envisioned,  the  PES  sample  allocation  is 
designed  to  achieve  a  number  of  purposes.  For  the  sake  of 
discussion,  the  sample  allocation  may  be  thought  to  be 
determined  by  the  following  four  steps: 

1.  Allocation of a basic sample size to each State in order
to  guarantee  a  minimum  target  reliability  on  the 
estimated  percent  net  undercount  for  each  State. 

2.  Supplementation  of  the  sample  in  the  largest  States  in 
order  to  improve  both  regional  and  minority  estimates. 

3.  Supplementation  of  the  sample  in  32  large  central 
cities. 

4.  Supplementation in the SMSA balances of the preceding
central cities in order to produce estimates for
each of these SMSA's.

In most States, the design will include the standard
technique of selecting a first-stage sample of smaller
counties and adjusting the selection rates within each
to maintain a constant unconditional probability of selection
for persons in the State. Not every county, therefore, will fall
into the PES sample.

As  a  consequence  of  both  the  allocation  and  the  first-stage 
of  selection,  the  number  of  sampled  cases  allocated  to 
individual  counties  will  vary  in  a  complex  manner.  A 
particular  group  of  counties,  those  containing  the  32  central 
cities,  will  have  a  reliability  almost  comparable  to  that  for 
small  States.  A  few  other  counties,  such  as  New  Castle,  Del., 
will  also  have  a  high  level  of  reliability  because  they  represent 
a  large  proportion  of  their  total  State  population.  From  the 
few  counties  that  will  be  quite  reliably  estimated,  there  is  a 
range  progressing  downward  to  a  very  low  level  of  reliability 
for  most  counties.  Counties  not  containing  the  special  cities 
but  in  their  SMSA's  will  be  among  those  in  the  middle  to 
upper end of this reliability range. In general, outside the
special SMSA's, the reliability of a county's estimate tends
to be in proportion to the county's share of the total State
population, except that the first stage of selection tends to
put a floor on the reliability for small sample counties. Because of
the  original  allocation  to  States,  a  relatively  large  county  in  a 
small  State  generally  has  a  higher  level  of  reliability  than  the 
same  size  county  (in  numbers  of  persons)  in  a  large  State. 

Summary  of  the  CPS  Sample  Allocation 

The  sample  allocation  of  the  CPS  has  evolved  over  time  to 
satisfy  a  number  of  requirements  for  labor  force  data.6  The 
current design generally follows purposes 1, 2, and 4 just
enumerated, except that purpose 4 is satisfied for a somewhat
different list of SMSA's. Central cities in the CPS are
not  specially  identified  for  supplementation  except  as  part  of 
their  SMSA's.  Like  the  PES,  the  CPS  design  uses  a  first-stage 
selection  of  counties  in  some  States. 

Again,  the  CPS  design  by  itself  will  yield  a  wide  range  of 
sample  sizes  for  individual  counties.  The  number  of  counties 
with  highly  reliable  estimates  will  generally  be  somewhat  less 
using  the  CPS  by  itself,  but  at  least  a  few  counties  will  have 
quite  reliable  estimates. 

Other  Considerations  in  Modeling  Net  Census 
Underenumeration 

As  a  general  rule,  small  area  estimation  techniques  relate 
data available in great detail and precision to a variable (such
as  net  census  underenumeration)  that  is  of  interest  but  with 
data  of  much  less  detail  or  sampling  reliability.  Thus,  the 
basic  issue  in  the  design  of  such  estimates  is  the  method  of 
expressing  these  relationships. 

A  synthetic  estimation  approach  in  this  application  would 
involve  the  cross-classification  of  characteristics  measured  in 


6  The  E-sample  allocation  will  probably  parallel  that  for  the  CPS. 
Hence,  the  comments  here  about  allocation  of  the  CPS  apply  to  the 
evaluation  program  as  of  April  1980. 


the  census  with  census  omissions  and  erroneous  inclusions 
for  persons  in  the  evaluation  samples.  This  procedure  easily 
incorporates  characteristics  attributed  to  persons,  especially 
age,  race,  and  sex.  It  is  possible,  however,  that  a  significant 
number  of  other  measurable  variables  may  also  be  associated 
with census undercoverage: a number of geographic
characteristics such as region, SMSA status, and size of place,
as well as a number of quality measures for the census such as
census close-out rates. Such variables are ecological in nature
and may relate general area characteristics to census
underenumeration.
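
A simple synthetic estimate of the kind described above can be
sketched in a few lines. The sketch assumes that net undercount
rates by demographic cell have already been estimated from the
evaluation samples at a national level, and it applies those
rates to a county's census counts in the same cells. The cell
labels, rates, and counts are hypothetical, and expressing each
rate as a fraction of the census count is a choice made only
for the example.

```python
def synthetic_net_undercount(census_counts, cell_rates):
    """Sum over cells of (county census count) * (national rate).

    census_counts: dict mapping a demographic cell to the county's
                   census count in that cell.
    cell_rates:    dict mapping the same cells to an assumed national
                   net undercount rate (fraction of the census count).
    """
    return sum(count * cell_rates[cell]
               for cell, count in census_counts.items())


# Hypothetical national rates by (age group, race, sex) cell.
national_rates = {
    ("20-29", "black", "male"): 0.05,
    ("20-29", "white", "male"): 0.01,
}

# Hypothetical census counts for one county in the same cells.
county_counts = {
    ("20-29", "black", "male"): 200,
    ("20-29", "white", "male"): 1000,
}

county_estimate = synthetic_net_undercount(county_counts, national_rates)
```

The estimate varies among counties only through their
demographic composition, which is why area-level variables such
as close-out rates do not fit naturally into this form.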

It  is  the  author's  view  that  a  simple  synthetic  approach 
cannot  be  readily  adapted  to  incorporate  a  wide  range  of 
variables.  Logistic  regression  for  erroneous  inclusions  and  for 
census  omissions  separately  may  hold  some  promise,  but  a 
more  direct  approach  may  be  to  apply  linear  regression  to 
sample  estimates  of  net  underenumeration  for  sampled 
counties. This last approach, with a few methodological
nuances, could include as an independent variable a
preliminary synthetic estimate for the area based on
demographic characteristics. Furthermore, it is this technique
that  lends  itself  best  to  empirical  Bayes  refinements. 

Empirical  Bayes  Possibilities 

Empirical  Bayes  estimation  may  be  thought  to  include 
two  basic  notions: 

1.  That,  in  order  to  estimate  a  quantity  for  a  single  unit, 
one  may  borrow  information  from  similar  units. 

2.  That  available  sample  data  are  allowed  to  shape,  to  an 
extent,  the  manner  and  degree  to  which  data  from 
other  units  are  used. 

An  earlier  research  effort  (Robert  E.  Fay  III  and  Roger  A. 
Herriot,  1979,  "Estimates  of  Income  for  Small  Places:  An 
Application  of  James-Stein  Procedures  to  Census  Data," 
Journal  of  the  American  Statistical  Association,  74,  pp. 
269-277)  represents  one  of  several  illustrations  of  these  ideas. 
To  review  this  one  example  briefly,  an  empirical  Bayes 
approach  was  employed  to  improve  the  average  accuracy  of 
estimates  of  income  for  small  places.  Sample  data  for  these 
small places were provided from the 1970 census, but the
small  size  of  the  sample  in  each  of  these  places  (with  total 
population  less  than  1,000  persons)  yielded  only  limited 
sampling  reliability.  The  method  consisted  essentially  of  four 
steps: 

1.  Fitting  a  regression  to  the  sample  estimates  using 
independent  variables  free  or  virtually  free  of  sampling 
variability  (such  as  per  capita  income  figures  for  the 
entire  respective  counties  or  tax  return  data  for  the 
places). 

2.  Evaluating the goodness-of-fit of the regression relative
to  independently  established  estimates  of  sampling 
variability. 
3.  For each place separately, weighting the regression and
sample  data  together,  considering  the  average 
goodness-of-fit  of  the  regression  and  the  sampling 
variability  for  the  individual  place. 

4.  Applying  a  constraint  to  prevent  the  final  estimate 
from  substantially  deviating  from  the  sample  estimate 
relative  to  the  sampling  variance. 

Thus,  this  procedure  borrows  information  from  other 
places  through  a  regression  analysis  and  determines  the 
amount  to  which  this  information  is  used  on  the  basis  of  the 
observed  goodness-of-fit  of  the  regression. 
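
The four steps above can be sketched as follows. This is a
minimal illustration, not the published procedure: it assumes a
single covariate x, sample estimates y with known sampling
variances d, a moment-matching fit of the model variance found
by bisection, and a one-standard-error constraint. All function
names and the data are hypothetical.

```python
import math


def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def fit_model_variance(x, y, d, iters=60):
    """Steps 1-2: find a >= 0 so that the weighted residual sum of
    squares, with weights 1 / (a + d_i), matches its degrees of freedom."""
    k = len(y)

    def excess(a):
        w = [1.0 / (a + di) for di in d]
        b0, b1 = weighted_line_fit(x, y, w)
        rss = sum(wi * (yi - b0 - b1 * xi) ** 2 for wi, xi, yi in zip(w, x, y))
        return rss - (k - 2)

    if excess(0.0) <= 0.0:
        return 0.0  # regression fits as well as sampling error allows
    lo, hi = 0.0, 1.0
    while excess(hi) > 0.0:
        hi *= 2.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def empirical_bayes(x, y, d, max_sd=1.0):
    """Steps 3-4: weight sample and regression estimates for each unit,
    then keep the result within max_sd standard errors of the sample value."""
    a = fit_model_variance(x, y, d)
    b0, b1 = weighted_line_fit(x, y, [1.0 / (a + di) for di in d])
    estimates = []
    for xi, yi, di in zip(x, y, d):
        weight_on_sample = a / (a + di)
        est = weight_on_sample * yi + (1.0 - weight_on_sample) * (b0 + b1 * xi)
        limit = max_sd * math.sqrt(di)
        estimates.append(min(max(est, yi - limit), yi + limit))
    return a, estimates


# Hypothetical data: covariate, sample estimates, sampling variances.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.4, 3.8, 5.3, 5.9]
d = [0.04, 0.25, 0.09, 1.0, 0.16, 0.5]
model_variance, smoothed = empirical_bayes(x, y, d)
```

Units with large sampling variance are pulled strongly toward
the regression line, while precisely estimated units stay close
to their sample values.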

Seemingly,  these  ideas  could  be  directly  transferred  to  the 
problem  of  estimating  net  census  error.  To  do  justice  to  this 
problem,  however,  a  number  of  choices  and  potential 
difficulties  must  be  addressed: 

1.  The  number  of  potential  regression  variables  is  much 
larger.  Empirical  Bayes  procedures  may  be  required  to 
smooth  estimates  of  regression  coefficients  over  classes 


of  independent  variables.  In  some  sense,  an  empirical 
Bayes  procedure  could  in  effect  select  a  model  by  only 
slightly  changing  some  classes  of  coefficients  but 
drastically  smoothing  others. 

2.  Evaluating  the  goodness-of-fit  of  a  regression  equation 
may  hinge  quite  critically  on  the  relatively  small 
number  of  counties  for  which  relatively  precise  sample 
estimates  are  available.  It  has  been  pointed  out  already 
that  this  is  a  fairly  special  group  of  counties. 

3.  The  question  of  geography  may  be  allowed  to  enter  in 
a  complex  way.  It  might  be  argued,  for  example,  that 
county  estimates  should  be  adjusted  to  consistency 
with  the  sample  estimate  for  the  State.  Sample 
estimates for major sub-State areas (say the
metro/nonmetro split) may also be used, but perhaps through
some  additional  empirical  Bayes  procedure. 

Certainly,  there  are  many  possible  paths  to  elaborate  these 
basic  ideas  into  a  working  procedure.  Much  will  depend  upon 
the  sampling  variances  that  are  achieved  by  the  evaluation 
program. 


Comments 


Tommy  Wright 

Union  Carbide  Corporation 


Since  the  paper  by  Dr.  Fay  was  not  available  for  review 
prior  to  the  conference,  my  comments  will  be  limited  to  the 
paper  by  Professor  Kish  and  the  paper  by  Professor  Dempster 
and  Mr.  Tomberlin. 

One  of  the  marks  of  excellent  teachers  is  the  value  they 
place  on  a  good  foundation.  That  is,  excellent  teachers  first 
emphasize  the  ideas,  concepts,  and  definitions.  This  appears 
to  be  the  theme  of  the  paper  by  Professor  Kish. 

Many  times  I  suspect  that  simply  because  different  terms 
are  used  to  describe  the  same  thing,  we  have  several  people 
getting  the  same  results  independently  of  each  other.  A 
classical  example  of  this  is  perhaps  the  optimal  allocation 
credited  to  Neyman  [3]  in  stratified  sampling  which  was 
published 11 years earlier by Tschuprow [5].

The  five  categories  of  missing  units  as  listed  by  Professor 
Kish  [(1)  item  nonresponse,  (2)  total  nonresponse,  (3) 
cluster  nonresponse,  (4)  noncoverage,  and  (5)  deliberate  and 
explicit  exclusions]  appear  to  be  sufficiently  broad  to  cover 
completely  everything  one  would  want  to  consider  when 
talking  about  various  kinds  of  missing  data.  The  extent  to 
which  one  should  be  worried  about  missing  data  involves 
consideration  of  the  consequences.  Kish  notes  that  the 
consequences  tend  to  be  worst  when  one  is  concerned  with 
the  estimation  of  simple  totals. 

Uniform terminology and consistency of ideas and concepts
are important considerations if several people are to
communicate  their  ideas  to  each  other  with  minimum 
misunderstanding. 

The  paper  by  Professor  Dempster  and  Mr.  Tomberlin 
introduces  the  concept  of  "the  probability,  p,  that  an 
individual  will  be  enumerated  in  the  census."  The  proposed 
method  of  attack  is  a  PES  with  a  multistage  design.  The 
probability, p, is a function of several factors, including
factors that consider geography, factors that consider social
characteristics, etc. One seeks an estimate for the
undercount, U, and uses ratio estimation of the type

$$\hat{U} = \frac{\hat{q}}{\hat{p}}\, C \qquad (1)$$

where $\hat{p}$ is an estimate for $p$ ($\hat{q} = 1 - \hat{p}$), $C$ is the number of
people in the PES who were counted in the census, and $\hat{U}$ is
the estimate of the undercount. Ways of estimating $U$ at
various levels and for various groups of interest are discussed
with the use of a logistic model

$$\operatorname{logit}(p) = \log\frac{p}{1-p} = \theta + \text{sum of other parameters} \qquad (2)$$

where $\theta$ is fixed.
The other parameters are fixed in some cases and
permitted to vary in others. (The model seems to permit
much flexibility.) When the parameters vary, prior
distributions are assumed and discussions involving Bayesian
procedures and posterior distributions surface. Attention
centers on estimation for small areas. The authors give a
sketch  of  their  plan  and  acknowledge  that  research  is  needed 
to  fill  a  number  of  gaps.  But  they  feel  reasonably  confident 
that  the  gaps  can  be  filled  and  cite  a  number  of  recent 
articles  that  support  their  claims. 
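
As a small worked illustration of the ratio estimate and the
logistic form summarized above, the sketch below converts a
value on the logit scale to an enumeration probability and then
to an undercount estimate. The function names and every numeric
value are assumptions made for the example; they are not taken
from the paper under discussion.

```python
import math


def logit(p):
    """Log-odds of p, for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(z):
    """Inverse of logit: maps the linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def undercount_estimate(p_hat, counted):
    """Ratio estimate (q_hat / p_hat) * C with q_hat = 1 - p_hat."""
    q_hat = 1.0 - p_hat
    return q_hat / p_hat * counted


# Hypothetical linear predictor: a fixed term plus other effects.
theta = 4.0
other_effects = -0.5
p_hat = inv_logit(theta + other_effects)

# Hypothetical count C of PES persons found in the census.
u_hat = undercount_estimate(p_hat, counted=4900)
```

For instance, an assumed p_hat of 0.98 with C = 4,900 gives an
estimated undercount of about 100 persons, since 4,900 / 0.98
is 5,000.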

The Bayesian framework is not totally clear in the usual
sequence of parameter, prior distribution, loss (though
quadratic loss is implied), observations dependent on
parameter, and posterior distribution. Specifically, is there
a prior on $\theta$ in (8)?

I  agree  that  there  is  great  hope  in  "matching  studies"  and 
we  should  note  the  planned  use  of  this  technique  for  the 
1980  census  (also  earlier  censuses)  as  noted  in  the  report, 
"Counting  the  People  in  1980:  An  Appraisal  of  Census 
Plans."  However,  it  appears  that  the  type  of  matching  study 
need  not  be  limited  to  the  postenumeration  survey  type  as 
suggested  by  Dempster  and  Tomberlin  and  seemingly  favored 
by the Census Bureau. One might also consider a
preenumeration survey or a procedure where various kinds of
persons are unknowingly "captured and tagged" (only on paper)
either by a purposive process or by some randomization process
independent of the census enumeration process. The possible
noncoverage of important subgroups or undercount of the
PES that Dempster and Tomberlin allude to in their paper
might be avoided with such a technique. (We have reports
from the Bureau that there appears to be a tendency for the
PES to miss the same groups as the census.) Of course some
sort of controlled selection is no doubt used in the PES to
decrease the probability of noncoverage of important
subgroups. Considering the huge amount of available data before
each census, it seems only natural that one would want to
consider the undercount problem in the Bayesian setting.

The  thoughts  of  Dempster  and  Tomberlin  remind  me  of  a 
problem  of  recent  interest  to  me  in  Health  and  Safety 
Research,  which  I  will  briefly  mention  below  using  two 
slightly  different  models.  (See  Wright  [6]  and  Bratcher, 
Schucany, and Hunt [1].)

Model I. We are given a finite population of size $N$. After
the census, we assume that the number of people missed is
$M$. ($N = N_c + M$, where $N_c$ is the number counted in the
census.) Suppose that we have prior feelings concerning the
true value of $M$ and that they are expressed by using the
beta-binomial prior distribution

$$f_1(M \mid \alpha, \beta, N) = \binom{N}{M} \frac{B(M+\alpha,\, N-M+\beta)}{B(\alpha, \beta)}$$

for $M = 0, 1, \ldots, N$, where

$$B(\alpha, \beta) = \int_0^1 t^{\alpha-1} (1-t)^{\beta-1}\, dt$$

for $\alpha > 0$ and $\beta > 0$.

The values of $\alpha$ and $\beta$ are chosen to represent the state of
prior  knowledge  about  M.  This  family  is  suitable  because  it  is 
"rich"  and  the  mathematics  is  tractable.  We  next  take  a 
random  sample  PES  of  size  n  and  observe  that  m  of  the  n 
were  missed  in  the  original  census.  It  is  routine  to  show  in 
this  setting  that  the  posterior  distribution  of  M  given  the 
sample  is  also  beta-binomial 


$$f_3(M \mid m) = \binom{N-n}{M-m} \frac{B(M+\alpha,\, N-M+\beta)}{B(m+\alpha,\, n-m+\beta)}$$

for $M = m, m+1, \ldots, N-n+m$.

An  estimate  of  M  can  be  found  as  follows.  Let 


$$\hat{M} = E(M \mid m) = (N-n)\,\frac{m+\alpha}{n+\alpha+\beta}.$$

Substituting $N_c + \hat{M}$ for $N$ and solving for $\hat{M}$ gives

$$\hat{M} = (N_c-n)\,\frac{m+\alpha}{n-m+\beta}.$$

(Recall $\hat{M} = \hat{N} - N_c$.)
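
The following is a minimal numerical sketch of Model I, not part of the original discussion. It evaluates the posterior distribution and the closed-form estimate given above. The function names and the illustrative inputs (1,000 persons counted, a sample of 100 with 3 misses, $\alpha = 1$, $\beta = 30$) are assumptions chosen only for demonstration.

```python
from math import comb, exp, lgamma, log

def log_beta(a, b):
    """Natural log of the beta function B(a, b), for a > 0 and b > 0."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def posterior_pmf(M, m, n, N, alpha, beta):
    """Posterior probability that M persons were missed, given m misses
    in a sample of n, for a population of size N (treated as known)."""
    if M < m or M > N - n + m:
        return 0.0
    log_p = (log(comb(N - n, M - m))
             + log_beta(M + alpha, N - M + beta)
             - log_beta(m + alpha, n - m + beta))
    return exp(log_p)

def estimate_missed(N_c, n, m, alpha, beta):
    """Closed-form estimate (N_c - n)(m + alpha) / (n - m + beta)."""
    return (N_c - n) * (m + alpha) / (n - m + beta)

# Hypothetical inputs, for illustration only.
N_c, n, m, alpha, beta = 1000, 100, 3, 1.0, 30.0
M_hat = estimate_missed(N_c, n, m, alpha, beta)

# Consistency check: with N = N_c + M_hat, the first expression
# (N - n)(m + alpha) / (n + alpha + beta) reproduces M_hat.
N_hat = N_c + M_hat
fixed_point = (N_hat - n) * (m + alpha) / (n + alpha + beta)
print(round(M_hat, 2), round(fixed_point, 2))
```

The check at the end confirms only that the two displayed expressions agree once $N_c + \hat{M}$ is substituted for $N$; it says nothing about whether the prior is well chosen.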


Model II. We are given a finite population whose size, we
believe, is $N$. The population contains an unknown number of
people in category 1, say $N_1$. Assume a beta-binomial prior
on $N_1$ of the form

$$f_1(N_1 \mid \alpha_1, \beta_1, N) = \binom{N}{N_1} \frac{B(N_1+\alpha_1,\, N-N_1+\beta_1)}{B(\alpha_1, \beta_1)}$$

for $N_1 = 0, 1, \ldots, N$.

To find $\hat{N}_1$ (i.e., to estimate $N_1$), we take a census and
observe $n$ people altogether, of which $n_1$ are observed to be in
category 1. We look upon the $n$ people observed as a sample,
and in particular a random sample. (Perhaps a "Bernoulli
sample" is more appropriate; see Strand [4].) After the
census, the overall undercount is $N - n$ and the undercount
for category 1 is $N_1 - n_1$. (That is, the numbers of persons
missing are $M = N - n$ and $M_1 = N_1 - n_1$.) The posterior
distribution for $N_1$ given the sample is

$$f_3(N_1 \mid n_1) = \binom{N-n}{N_1-n_1} \frac{B(N_1+\alpha_1,\, N-N_1+\beta_1)}{B(n_1+\alpha_1,\, n-n_1+\beta_1)}$$

for $N_1 = n_1, n_1+1, \ldots, N-n+n_1$.

Now given that we observe $n_1$ people in category 1 in the
census (random sample) of size $n$, the probability that there
are exactly $N_1'$ people in category 1 in the uncounted part
of the population is

$$P(N_1 = N_1' + n_1 \mid n_1) = f_3(N_1' + n_1 \mid n_1) = \binom{N-n}{N_1'} \frac{B(N_1'+n_1+\alpha_1,\, N-N_1'-n_1+\beta_1)}{B(n_1+\alpha_1,\, n-n_1+\beta_1)}.$$


The above models are presented here merely as possible
starting points for modeling the undercount problem. Indeed,
the assumptions of Model II are suspect; for example, it
assumes that the people enumerated form a random sample of
the total population. However, it might be a starting point.
One advantage of Model II is that it can easily be extended to
determine the probability of missing anyone in $k$ categories
and can give estimates for $M_1, M_2, \ldots, M_k$. We would take a
Dirichlet-multinomial as our prior distribution.
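
The text does not spell out the Dirichlet-multinomial version. As a hedged sketch, the code below uses the standard conjugate result that, under a Dirichlet-multinomial prior with parameters $\alpha_1, \ldots, \alpha_k$ and a random sample of $n$ persons with $n_i$ in category $i$, the expected number of uncounted persons in category $i$ is $(N - n)(n_i + \alpha_i)/(n + \sum_j \alpha_j)$. The function name and all numbers are hypothetical, and the random-sample assumption already flagged as suspect applies here as well.

```python
def expected_uncounted(N, counts, alphas):
    """Expected uncounted persons per category under a Dirichlet-multinomial
    prior (parameters alphas), given category counts from a random sample.
    N is the total population size, treated as known."""
    n = sum(counts)
    if N < n:
        raise ValueError("N must be at least the number of persons counted")
    denom = n + sum(alphas)
    return [(N - n) * (c + a) / denom for c, a in zip(counts, alphas)]

# Hypothetical example: 950 of 1,000 persons counted in three categories.
estimates = expected_uncounted(1000, [600, 250, 100], [1.0, 1.0, 1.0])
print([round(e, 2) for e in estimates], round(sum(estimates), 2))
```

With two categories this reduces to the mean of the beta-binomial posterior for the uncounted part in Model II, and the category estimates always sum to $N - n$.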

I  would  like  to  close  with  the  following  comments. 

(1)  It  is  interesting  to  note  that  the  law  dictates  a 
complete  enumeration  and  objects  to  sample  estima- 
tion. However,  the  final  numbers  are  adjusted  figures 
due  to  sampling.  In  the  Kish  [2]  spirit,  perhaps 
the  lawmakers  should  ask,  in  this  "Age  of  Survey 
Sampling,"  what  do  we  want  to  accomplish?  After 
this  issue  is  settled,  one  can  ask  how  to  accomplish 
these  goals.  It  may  mean  a  complete  census  is  still 
necessary;  it  may  mean  that  sampling  might  be 
sufficient;  or  it  might  mean  that  one  can  make  use  of 
censuses  and  surveys  in  the  enumeration  process. 

(2) It would also be of interest to consider whether a
complete census every 10th year might not be in
some way equivalent to one-tenth of a census every
year for 10 years, or to one-half of a census every
fifth year for 10 years, etc.

Floor Discussion


A question arose concerning clarification of a variance
that Dr. Fay mentioned being increased by a factor of 6 to
20. There was discussion of the magnitude of the variance,
since it may shed some light on how difficult the problems
are that are being discussed. Dr. Fay responded that, as a first
cut, instead of using $pq/n$ (where $p$ is the census undercoverage
rate, $q = 1 - p$, and $n$ is the number of sample persons) as the
variance of a sample estimate of missed persons, the average
number of persons per household was used as a design effect, so
that instead of dividing by the number of persons, division was by
the number of households to approximate the variance.
However, because of the area-sampling nature of the PES,
because a problem can occur within a block (such as the
complete miss of a block in the census), and because the
missed rate includes both erroneous omissions and erroneous
inclusions that are both very large and not highly enough
correlated to offset each other, the variance is greatly
increased. This was determined by taking census pretest
results and simulating the variance that would be obtained
for a sample designed to take a probability-proportional-to-size
sample of blocks, with subsampling within blocks for an
equal-weighting sample. It was found that the design effect in
these data meant that instead of a coefficient of variation
for the corrected population of 0.3 percent at
the State level, it might be roughly in the neighborhood of
1.0 percent, which may yield statements for estimates of
undercounts for States that could be as large as 2 percent,
plus or minus. This is somewhat disheartening. Whether this
situation will recur in 1980 is questionable.

Further clarification of Dr. Fay's point indicated that by
doing computations using robust methods of estimating the
variance of the results, the variance, instead of being $pq/n$, was
6 to 20 times that big.
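
The figures quoted in this exchange can be cross-checked with a little arithmetic. Because the coefficient of variation scales with the square root of the variance, moving from 0.3 percent to about 1.0 percent implies a variance inflation of roughly 11, which sits inside the stated range of 6 to 20. The sketch below is illustrative only; the function names are hypothetical.

```python
def implied_design_effect(cv_simple, cv_actual):
    """Variance inflation implied by two coefficients of variation."""
    return (cv_actual / cv_simple) ** 2

def inflated_cv(cv_simple, design_effect):
    """Coefficient of variation after the variance is multiplied by design_effect."""
    return cv_simple * design_effect ** 0.5

# Coefficients of variation are in percent, as quoted in the discussion.
deff = implied_design_effect(0.3, 1.0)
low, high = inflated_cv(0.3, 6), inflated_cv(0.3, 20)
print(round(deff, 1), round(low, 2), round(high, 2))
```

Assuming the usual two-standard-error convention, a coefficient of variation near 1.0 percent corresponds to the "2 percent, plus or minus" statement for State estimates.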

It  was  questioned  whether  this  meant  that  the  PES  is 
anticipated  to  be  useless.  The  Census  Bureau  responded  that 
this  is  only  a  variance  consideration.  If  the  same  results 
materialize  in  1980,  the  Bureau  will  have  much  bigger 
variances  than  it  had  hoped  to  obtain,  but  that  may  not 
make  the  data  useless.  It  was  emphasized  also  that  these 
concerns  apply  only  to  the  PES.  The  Bureau  does  not  have  a 
similar  evaluation  as  to  what  may  be  expected  from  the  CPS, 
which  may  turn  out  better  as  it  is  less  clustered. 

Built-in overall surveillance that protects against the
effects of extreme outliers or wild values was also suggested.
The Bureau indicated that this could be done, except that
perhaps the direct sample estimate is the most direct way of
representing what really happens nationally; that is, with
a very large sample, one can even admit into the estimate the
effects of those outliers. It was felt that perhaps models
should be made to handle outliers subnationally, particularly
when State estimates are derived, as one might not be able to
accept the effects of such things at a State level.

The  group  inquired  as  to  whether  or  not  the  sampling  of 
census  records  for  the  administrative  records  match  will  be 
on  a  block  level  so  that  entire  blocks  will  be  sampled.  If  so,  it 
may  be  more  feasible  for  the  estimates  to  be  in  terms  of  the 
proportion  of  the  block  missed  rather  than  the  likelihood  of 
the  individual  being  missed.  Then  blocks  could  be  looked  at 
in  terms  of  the  probability  of  major  problems  occurring. 

Although  the  value  and  future  of  the  PES  had  been 
questioned  earlier  in  the  discussion,  it  was  suggested  that 
some  form  of  data  file  be  constructed  that  can  be  used  by 
outsiders.  It  was  felt  that  a  serious  difficulty  in  the  research 
into  the  undercount  has  been  that  much  of  the  data  has  been 
unavailable  to  outsiders.  It  was  concluded  that  many  of  the 
suggestions  made  at  the  conference,  in  fact,  would  have  been 
improved  if  people  had  been  able  to  conduct  some  explora- 
tory analysis  on  the  data.  It  was  recommended  that  some  of 
the  work  done  in  1970  be  made  into  a  public  use  file, 
particularly  matching  work  done  with  administrative  records 
involving  the  CPS,  which  was  not  finished  in  a  form  that 
could  be  made  public. 

Given  the  life  expectancy  of  administrative  records,  little 
problem  with  data  confidentiality  should  be  experienced 
now  with  those  types  of  data,  provided  that  geographic 
identification  is  still  restricted.  This  might  be  of  particularly 
great  potential,  since  any  undercount  adjustment  made 
should  be  consistent  with  administrative  systems,  given  the 
heavy  use  made  of  administrative  records  by  policymakers. 
Multisystem  matching  at  least  on  a  sample  basis  also  was 
reinforced  (e.g.,  Medicare,  Social  Security,  Employment 
Security,  Internal  Revenue  Service). 

There  may  be  some  difficulty  experienced  because  the 
populations  covered  by  the  files  do  not  overlap  sufficiently, 
which creates difficulty in multiple-system estimation even
though sampling is large; that is, there is a possibility for
large  sampling  errors,  but  also  large  biases  if  the  underlying 
assumptions  are  not  met,  as  they  might  not  be.  However, 
having  these  types  of  data  available  at  a  very  detailed 
geographic  level  over  many  years  could  make  significant 
contributions  to  some  of  the  synthetic  procedures  that  are 
going to be applied.

The  nonlinear  form  of  the  synthetic  estimates  introduced 
by Dempster and Tomberlin also was supported. This was
thought  possibly  to  be  a  partial  answer  to  the  problem 
referred  to  by  Dr.  Fay— the  fact  that  synthetic  estimates  may 
collapse  when  one  tries  to  combine  them  due  to  the 
nonlinear  nature  of  the  undercount. 

It was also speculated that some of the problems with the
variance referred to by Dr. Fay may not point to the
necessity to go to an indirect synthetic or regression-type
estimate even for the areas where there are data for making
direct estimates, or possibly a Stein-James type of combination,
which would give, in this case, very low weight to the
direct estimate because of its high variance.


Impacts  of 
Adjusting 


The  Impact  of  Census  Undercoverage 
on  Federal  Programs 


Courtenay  M.  Slater 
U.S.  Department  of  Commerce 


The  widespread  interest  in  statistical  adjustment  of 
decennial  census  data  stems  in  large  part  from  the  conviction 
that  differential  population  undercoverage  produces  serious 
inequity  in  the  administration  of  Federal  programs, 
especially  programs  which  distribute  funds  in  accordance 
with  statistical  formulas.  Since  many  billions  of  dollars  are 
distributed  each  year  through  such  formula  grant  programs, 
the  concern  is  understandable.  Yet  the  kind  of  compre- 
hensive evaluation  of  the  impact  of  census  undercoverage  on 
Federal  programs  needed  for  an  informed  decision  on 
adjustment  of  the  undercount  is  lacking. 

In  order  to  initiate  structured  discussion  and  investigation 
of  the  Federal  program  impacts  of  possible  adjustment  for 
census  undercoverage,  members  of  the  interdepartmental 
Statistical  Policy  Coordination  Committee  (SPCC)  recently 
were  asked  to  respond  to  the  following  questions: 

1.  How  would  statistical  series  prepared  by  Federal 
agencies  be  affected  by  adjustment  of  the  census  data? 

2.  What  Federal  fund  allocations  or  other  program 
administration  activities  would  be  affected  and  to  what 
extent? 

3.  Assuming  adjusted  data  would  become  available  in 
1983,  what  problems  would  be  created  by: 

(a)  use  of  preliminary   (unadjusted)  data  during  the 
interim; 

(b)  shifting  to  revised  (adjusted)  data  when  it  becomes 
available? 

4.  In  addition  to  population  totals,  what  other 
characteristics  (e.g.,  income,  employment,  age,  race, 
sex,  ethnic  origin)  would  be  required  on  an  adjusted 
basis  for  Federal  program  administration?  What  would 
be  the  impact  on  the  agencies  of  having  some  data 
available  on  an  adjusted  basis  and  other  data  available 
only  on  an  unadjusted  basis? 

5.  What  degree  of  geographic  detail  would  be  required  for 
adjusted  data  used  in  Federal  program  administration? 

Drawing  on  agency  responses  to  this  request  as  well  as  on 
earlier  research  work  done  at  the  Census  Bureau  and 
elsewhere,  this  paper  attempts  to  identify  some  of  the 
Federal  program  considerations  which  should  enter  into 
decisions  on  whether  corrections  for  census  underenu- 
meration  should  be  made  and,  if  so,  how  they  should  be 
made  statistically. 


IMPACT  OF  UNDERCOVERAGE  ON 
STATISTICAL  SERIES 

First,  what  are  the  statistical  programs  which  are  affected 
by  census  undercoverage  and  what  is  the  nature  of  the 
effects?  The  affected  statistical  programs  are  not  limited  to 
the  census  data  itself.  Other  statistical  programs  are  affected 
in  at  least  three  important  ways:  Through  use  of  census  data 
to  derive  postcensal  estimates,  through  the  use  of  census 
figures  as  a  sampling  frame  and  as  the  basis  for  control  totals, 
and  through  use  of  census  data  as  the  denominator  in  ratio 
calculations. 

The  Census  Bureau's  population  and  per  capita  income 
estimates  illustrate  the  use  of  census  data  to  derive  postcensal 
estimates.  The  monthly  Current  Population  Survey  (CPS) 
illustrates  the  use  of  census  counts  as  a  sampling  frame  and  as 
the  basis  for  control  totals.  Vital  statistics  (birth  and  death 
rates)  illustrate  the  use  of  census  population  totals  as  a 
denominator.  Per  capita  income  estimates  are  another 
example  of  the  use  of  population  as  a  denominator. 

Postcensal  Estimates 

Since the inception of the general revenue sharing
program in 1972, the Census Bureau has prepared local area
population  and  income  estimates  biennially  for  use  in 
distributing  these  funds.  Data  from  the  1970  census  serve  as 
the  starting  point  for  these  estimates,  and  the  estimates  are 
affected  both  by  1970  population  undercoverage  and  by 
underreporting  of  income  among  those  who  were  counted  in 
the  1970  census. 

Current  Population  Survey 

The  CPS,  a  monthly  Census  Bureau  survey  of  71,000 
households,  is  the  source  of  a  wide  range  of  current  data  on 
household  characteristics.  Most  importantly  for  the  present 
discussion,  it  is  the  source  of  monthly  national  data  on  labor 
force  employment  and  unemployment,  it  is  a  crucial 
component  of  local  area  employment  and  unemployment 
estimates,  and  it  is  the  source  of  widely  used  annual 
estimates  of  individual  and  family  income  and  poverty. 
Postcensal  estimates  consistent  with  the  decennial  census 
provide  control  totals  for  the  CPS,  and  census  undercoverage 
leads  to  similar  undercoverage  rates  in  the  CPS. 

Vital Statistics: Per Capita Income Estimates

Birth,  marriage,  divorce,  and  death  rates  are  computed  by 
the  National  Center  for  Health  Statistics  (NCHS)  as  the  ratio 
of  registered  births,  marriages,  etc.,  to  population  totals 
provided  by  the  Census  Bureau.  With  one  exception,  these 
statistical  series  make  no  adjustment  for  census  under- 
coverage.  Per  capita  income  estimates  prepared  by  the 
Bureau  of  Economic  Analysis  (BEA),  which  similarly  utilize 
census  population  data,  have  widespread  program  uses  at  all 
levels  of  government.  The  income  estimates  are  obtained 
from  sources  independent  of  the  census,  largely  adminis- 
trative records.  These  income  totals  then  are  divided  by 
population  figures  which  do  reflect  census  undercoverage, 
causing  an  overstatement  of  income  per  capita. 

IMPACT  OF  UNDERCOVERAGE  ON 
FUND  ALLOCATION 

Evaluation  of  the  impact  of  census  undercoverage  on 
Federal  funding  programs  must  take  into  account  the  use  of 
census  data  in  preparing  other  statistical  series,  as  illustrated 
above,  as  well  as  the  direct  use  of  census  data  in  program 
administration.  Some  of  the  major  program  uses  of  the 
affected  statistical  data  are  described  below.  In  some  cases 
the  impact  of  undercoverage  can  be  illustrated  using  past 
data;  in  others,  no  studies  illustrating  such  effects  have  been 
located. 

Revenue  Sharing 

The  program  whose  reliance  on  census  data  is  best  known 
and  most  thoroughly  studied,  undoubtedly,  is  general  rev- 
enue sharing.  Under  this  program,  nearly  $7  billion  will 
be  distributed  to  State  and  local  governments  this  fiscal  year 
based  on  a  formula  utilizing  population,  per  capita  income, 
and  tax  effort.  Studies  by  Jacob  Siegel  and  others  have 
demonstrated  that  income  undercoverage  has  a  far  greater 
impact  on  the  distribution  of  revenue-sharing  funds  than 
does  population  undercoverage.  This  is  true  of  funds  dis- 
tributed to  the  States  (1/3  of  the  total)  and  even  more 
pronounced  with  respect  to  funds  distributed  to  localities 
(2/3  of  the  total).  Siegel's  1975  study  indicates  that 
adjustment  for  income  underreporting  would  have  shifted 
revenue-sharing  allocations  by  3  percent  or  more  for  10 
States.  In  contrast,  adjustment  of  the  population  count 
would  not  have  affected  any  State  by  as  much  as  3  percent. 

A  1979  study  by  Siegel  and  Robinson  examines  impacts 
of  adjustment  on  localities  in  Maryland  and  New  Jersey.  The 
dominance  of  the  income  factor  is  even  greater  in  the 
formula  governing  distributions  to  localities.  The  authors 
conclude: 

.  .  .  per  capita  income  is  the  dominant  element  among  the 
various elements in the revenue-sharing formula at the
sub-State level and has a superordinate effect when data
errors  in  the  formula  are  corrected.  Large  shifts  in  funds 
among  counties  and  local  areas  result  when  the  per  capita 
income  factor  is  corrected.  .  .  .  The  transfer  of  funds 
among  areas  that  is  realized  when  the  population  compo- 
nent is  adjusted  for  underenumeration  is  small  by 
comparison.  If  the  cause  of  equity  is  to  be  served  in  the 
distribution  of  revenue-sharing  funds  to  local  areas,  it  may 
be  more  important  to  develop  and  apply  accurate 
corrections  for  the  data  on  income  than  to  apply 
corrections  for  the  population  counts. 

Several  points  are  worth  noting: 

1.  In  these  studies,  the  income  data  were  adjusted  for 
both  the  underenumeration  of  the  population  and  the 
underreporting  of  income  by  those  who  were  counted. 

2.  Adjustment  for  both  population  and  income  caused 
little  increase  in  the  shift  of  apportioned  funds, 
compared  to  adjusting  income  alone,  especially  at  the 
local  level.  Funds  allocated  to  localities  in  Maryland 
and  New  Jersey  would  have  been  shifted  by  an  average 
of  9  percent  by  this  combined  adjustment,  but  the 
overwhelmingly  important  sources  of  the  adjustment 
are  the  income  factors  in  the  formula. 

3.  Results  vary  depending  on  the  method  utilized  to 
adjust  for  the  undercount.  Most  of  the  results 
described  here  are  based  on  a  simple  synthetic  adjust- 
ment. Tests  of  different  adjustment  methods  at  the 
State  level  in  some  cases  yield  fairly  large  differences  in 
the  results.  Alternative  undercount  adjustment 
methods  at  the  local  level  have  yet  to  be  evaluated. 

4.  Since  a  fixed  funding  total  is  assumed,  the  amount  of 
money  withdrawn  from  some  governments  by  defini- 
tion equals  the  increase  provided  others.  However,  in 
the  Maryland-New  Jersey  study,  the  number  of  local 
governments  losing  funds  consistently  exceeded  the 
number  of  gainers.  In  Maryland,  the  number  of  losers 
outnumbered  the  gainers  by  4  to  1;  similarly,  but  less 
dramatically,  56  percent  of  local  governments  in  New 
Jersey  would  have  been  losers.  It  also  seems  to  be  the 
case  that  more  often  than  not  the  losing  localities 
represent  a  larger  fraction  of  total  population  than  do 
the  gainers. 
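
The zero-sum property in point 4 can be illustrated with a small sketch. It uses a deliberately simplified, hypothetical formula in which each area's weight is its population divided by its per capita income; the statutory formula also involves tax effort and other provisions that are omitted here. The areas, undercount rates, and incomes are invented for illustration, and only the population factor is adjusted.

```python
def allocate(total, populations, per_capita_incomes):
    """Split a fixed total in proportion to population / per capita income."""
    weights = [p / y for p, y in zip(populations, per_capita_incomes)]
    scale = total / sum(weights)
    return [w * scale for w in weights]

# Hypothetical areas: census counts, assumed undercount rates, incomes.
counted = [100_000, 250_000, 50_000, 400_000]
undercount_rate = [0.05, 0.02, 0.01, 0.025]
income = [4_000.0, 5_500.0, 6_000.0, 5_000.0]

# Population-only adjustment: inflate each count by its assumed undercount.
adjusted = [c / (1 - u) for c, u in zip(counted, undercount_rate)]
before = allocate(1_000_000.0, counted, income)
after = allocate(1_000_000.0, adjusted, income)
shifts = [a - b for a, b in zip(after, before)]

gains = sum(s for s in shifts if s > 0)
losses = -sum(s for s in shifts if s < 0)
print([round(s) for s in shifts], round(gains), round(losses))
```

Under this simplified formula an area gains only if its own adjustment factor exceeds the weighted average adjustment across all areas, so a few heavily undercounted areas can gain at the expense of many others. In this made-up example one area gains and three lose.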

Conclusions  about  which  one  might  speculate  based  on 
these  studies  are: 

1.  Adjustment  of  population  counts  only  appears  to  be  of 
limited  value.  Adjustment  of  income  clearly  seems 
more  important,  but  is  of  greater  complexity  because 
independent  sources  must  be  found  to  adjust  for 
income  underreporting  by  those  covered  by  the  census. 

2.  The  political  acceptability  of  adjustment  appears  in 
doubt,  since  losers  outnumber  gainers. 

3. Equity in the allocation of revenue-sharing funds might
more  readily  be  achieved  by  revising  the  formulas 
rather  than  adjusting  the  census. 

4.  Finally,  discussion  of  the  basic  merits  of  revenue 
sharing  lies  outside  the  scope  of  this  paper,  but  it  is 
worth  noting  that  the  present  authorization  expires  in 
September  of  this  year,  and  the  future  of  the  program 
is  not  entirely  certain. 

Programs  Relying  on  the  Current  Population  Survey 

As  indicated  above,  the  CPS  is  a  source  of  labor  market 
information  and  information  on  income  and  poverty.  Both 
types  of  data  are  widely  used  in  program  administration. 
Fund  allocations  based  in  part  on  CPS  employment  and 
unemployment  data  include: 

•  General  training  and  employment  programs,  budgeted 
at  $2.8  billion  in  1981 

•  Public  service  employment,  $4.4  billion 

•  Youth  employment  programs,  $2.6  billion 

•  Countercyclical    and    targeted    fiscal    assistance,    $1.0 
billion 

Data  on  the  prevalence  of  poverty  or  low  income  also 
enter  into  the  allocations  for  some  of  the  above  programs. 
Other  programs  utilizing  household  income  data  from  the 
CPS  as  an  element  in  determining  fund  allocations  include: 

•  Community    development   block   grants,    budgeted   at 
$3.8  billion  in  1981 

•  Urban  development  action  grants,  $0.4  billion 

•  Low-income  energy  assistance,  $2.4  billion. 

These  lists  are  not  exhaustive,  but  they  are  sufficient  to 
illustrate  that  we  are  concerned  with  a  wide  variety  of 
programs  and  that  large  sums  of  Federal  monies  are  involved. 
It  is  not  possible  to  cover  here  all  the  ramifications  of  the  use 
of  CPS  data  in  each  of  these  programs,  but  several  points 
may  be  noted. 

1.  In  general,  each  program  uses  a  different  set  of  CPS 
data,  and  the  data  enter  into  the  formula  in  different 
ways.  Some  use  unemployment  rates,  some  use 
absolute  numbers  of  employed  or  unemployed,  some 
use  the  number  of  persons  below  the  poverty  level, 
some  use  multiples  of  the  poverty  threshold,  and  so 
on. 

2.  Typically,  these  programs  require  data  estimates  for 
small  places,  in  some  cases  for  each  of  the  Nation's 
39,000  units  of  general  government. 

3.  The  local  area  employment  and  unemployment  data 
used  in  administering  the  various  employment  and 
training  programs  are  obtained  only  partially  from  the 
CPS. However, the other data sources used in these
programs also rely heavily on the decennial census as a
starting  point.  The  Commissioner  of  Labor  Statistics 
states  that  "It  is  no  exaggeration  to  say  that  every 
series  published  in  the  Local  Area  Unemployment 
Statistics  program  would  be  affected  by  [census 
undercoverage]  adjustment." 

Even  so,  because  the  uses  of  the  data  are  so  varied,  the 
effects  of  census  undercoverage  are  difficult  to  estimate.  The 
use  of  the  census  as  a  sampling  frame  is  not  thought  to  be  the 
source  of  any  major  CPS  error,  since  various  procedures  are 
followed  to  adjust  the  sampling  frame.  The  use  of  census- 
derived  population  estimates  as  control  totals  for  interpreting 
the  survey  results  does  cause  census  undercoverage  to  affect 
CPS  estimates,  however. 

Some  examination  has  been  made  of  the  effects  of  census 
undercoverage  on  national  employment  levels  and  rates. 
These  studies  indicate  that,  although  the  number  of  persons 
identified  as  employed  and  unemployed  would  be  signifi- 
cantly increased  by  a  synthetic  adjustment  for  census 
undercoverage,  unemployment  rates  would  be  affected  very 
little.  Johnston  and  Wetzel's  1969  study  estimated  that  the 
national  unemployment  rate  in  1967  would  have  risen  only 
from  3.8  to  3.9  percent  through  correction  for  census 
undercoverage.  This  result  held,  whether  the  uncounted  were 
assumed  to  have  the  same  labor  force  status  as  the  average 
for  their  age,  sex,  and  race,  or  whether  they  were  assumed  to 
have  the  characteristics  of  those  of  the  same  age,  sex,  and 
race  living  in  urban  poverty  neighborhoods.  More  recent 
Census  Bureau  calculations  have  also  shown  little  change  in 
the  unemployment  rate  when  it  is  corrected  for  census 
undercoverage.  On  balance,  the  uncounted  fall  into  age,  sex, 
and  race  groups  having  unemployment  rates  higher  than  the 
published  national  average,  but  they  are  too  small  a 
percentage  of  the  total  to  have  much  effect  on  the  national 
unemployment  rate. 
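
A short sketch shows why a synthetic adjustment can raise the counts noticeably while barely moving the rate. Only the 3.8 percent published rate comes from the text; the size of the uncounted labor force and its unemployment rate are hypothetical inputs chosen for illustration, not figures from the studies cited.

```python
def adjusted_unemployment_rate(counted_lf, counted_rate,
                               uncounted_lf, uncounted_rate):
    """Unemployment rate after adding an uncounted labor force
    with its own (assumed) unemployment rate."""
    unemployed = counted_lf * counted_rate + uncounted_lf * uncounted_rate
    return unemployed / (counted_lf + uncounted_lf)

# Hypothetical: the uncounted add 3 percent to the labor force
# and have a 7 percent unemployment rate.
rate = adjusted_unemployment_rate(100_000, 0.038, 3_000, 0.07)
print(f"{rate:.1%}")
```

In this example the number of unemployed rises by about 5.5 percent (from 3,800 to 4,010), yet the rate moves by only about a tenth of a percentage point, because the added group is a small share of the total.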

It  may,  of  course,  be  argued  that  those  who  are  not 
counted  in  the  census  have  even  higher  unemployment  rates 
than  the  average  residents  of  poverty  neighborhoods,  but, 
not  having  managed  to  count  these  people,  we  have  no  data 
with  which  to  test  this  assertion. 

It  may  also  be  argued  that  it  is  not  the  national  average 
unemployment  rate,  but  the  rates  for  various  age,  race,  or  sex 
categories which are of concern. Surprisingly, the available
studies suggest that correction for census undercoverage may
slightly lower the national unemployment rate for
blacks. This is because undercoverage is especially high for
adult  black  males,  a  group  with  a  lower  unemployment  rate 
than  that  for  all  blacks.  Intuitively,  one  resists  this  result. 
Surely,  the  uncounted  have  higher  than  average  unemploy- 
ment rates,  but  test  results  show  that  the  census  under- 
coverage rate  for  black  employed  is  higher  than  for  black 
unemployed.    Intuition    is    sometimes    a    dangerous    guide. 
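
The direction of this effect follows from simple arithmetic: if the employed are missed at a higher rate than the unemployed, correcting both groups lowers the unemployment rate. The sketch below uses invented counts and miss rates purely to show the mechanism; none of the numbers come from the test results mentioned above.

```python
def corrected_rate(employed, unemployed,
                   miss_rate_employed, miss_rate_unemployed):
    """Unemployment rate after inflating each group by its own miss rate
    (miss rates are fractions of the true group size)."""
    true_employed = employed / (1 - miss_rate_employed)
    true_unemployed = unemployed / (1 - miss_rate_unemployed)
    return true_unemployed / (true_employed + true_unemployed)

# Hypothetical counted group: 920 employed, 80 unemployed (8.0 percent).
observed = 80 / (920 + 80)
corrected = corrected_rate(920, 80, 0.10, 0.06)
print(f"{observed:.2%} -> {corrected:.2%}")
```

When the two miss rates are equal the rate is unchanged; it falls only because the employed are assumed to be missed more often.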

It  might  further  be  argued  that  national  data  are  of  little 
interest  in  the  present  context;  it  is  accurate  measurement  of 
regional variation which is of concern. Certainly, the degree
of regional variation is of concern. Very little information
about it is presently available, however.

Use of household income and poverty data from the CPS
raises a new set of complications. Poverty is a household
rather than an individual concept; determination of whether
a household is below the poverty line depends on household
size and household income. Estimates of census undercoverage
have focused on the number of uncounted individuals. In
order to examine the effect of undercoverage on poverty
estimates, assumptions or imputations would be necessary
regarding characteristics, not of the individuals themselves,
but of the households in which they reside. A large part of
census undercoverage stems from failure to count entire
households. Presumably, one could make some sort of
imputations about the characteristics of these households.
However, it would seem quite risky to assume, without a
great deal more investigation, that such arbitrary imputations
would, in fact, produce either more accurate information or
greater equity in the allocation of funds.

What conclusions or speculations can one draw from all
this?

1.  The  CPS,  heavily  reliant  on  decennial  census  data  for 
its  design  and  execution,  is  a  major  source  of  statistics 
used  in  the  allocation  of  Federal  funds.  Its  combined 
uses  total  a  great  deal  more  money  than  is  involved  in 
the  general  revenue  sharing  program. 

2.  The  extent  of  the  error  introduced  into  the  CPS, 
because  of  census  undercoverage,  can  be  and  has  been 
at  least  roughly  measured.  However,  little  attention  has 
been  focused  on  the  geographic  distribution  of  the 
error or on the implications for Federal programs. A
great  deal  more  research  is  in  order. 

3.  As  was  the  case  with  the  general  revenue  sharing 
program,  it  is  not  just  accurate  population  counts 
which  are  important.  One  must  also  know— or  be  able 
to  estimate— characteristics  of  the  uncounted;  not  only 
race,  age,  and  sex,  but  employment  status  and  some- 
thing about  the  size  and  income  of  the  households  in 
which  they  live.  It  is  hoped  that  the  postcensal 
evaluation  program,  planned  in  conjunction  with  the 
1980  census,  will  provide  information  which  will  yield 
some  clues  regarding  these  characteristics  of  the 
uncounted.  At  the  moment,  however,  we  do  not  have 
the  information  necessary  to  impute  characteristics 
such  as  these  with  any  useful  degree  of  accuracy. 

Programs  Utilizing  Per  Capita  Income  Estimates 

As  noted,  census  population  counts  and  postcensal 
population  estimates  based  on  these  counts  are  widely  used 
as  denominators  for  computing  other  statistical  series.  The 
per  capita  income  estimates  prepared  by  the  Bureau  of 
Economic Analysis are a leading example of such a use.
These per capita income statistics are utilized in administering
a multitude of Federal programs. The BEA has
identified  fund  allocation  uses  which  totaled  $29  billion  in 
fiscal  1979.  This  is  not  necessarily  an  exhaustive  list.  The 
typical— though  not  the  only— purpose  of  using  these  esti- 
mates is  to  identify  the  prevalence  of  low-income  popula- 
tions in  need  of  various  types  of  Federal  assistance. 
Examples  of  the  uses  include: 

•  Educational  programs  (vocational  education  grants, 
library  grants,  and  others),  obligations  of  $1.5  billion  in 
fiscal  1979 

•  Public  Health  Service  programs,  $0.5  billion 

•  Medicaid,  $11.8  billion 

The  Medicaid  program  is  an  example  of  several  of  the 
largest  Federal  assistance  programs  which  (on  either  a 
required  or  an  optional  basis)  utilize  BEA's  State  per  capita 
income  estimates  to  determine  the  matching  requirement  for 
the  State's  contribution  to  programs. 

The  BEA  develops  the  per  capita  income  figures  by  first 
estimating  total  personal  income  and  then  dividing  it  by 
population  estimates  supplied  by  the  Census  Bureau.  The 
personal  income  estimates  are  derived  from  sources  inde- 
pendent of  the  population  census;  about  90  percent  of  the 
information  comes  from  administrative  records  such  as 
unemployment  insurance,  social  security,  and  income  tax 
records.  Although  the  data  sources  contain  various  imperfections,
their  coverage  is  much  more  complete  than  that  of  income
estimates  obtained  from  census  or  household  survey  data.
Dividing  these  data  by  a  population  estimate  that  reflects 
census  undercoverage  has  the  effect  of  overstating  income 
per  capita.  Geographically,  the  extent  of  overstatement  may 
be  assumed  to  be  approximately  proportional  to  the  degree 
of  census  undercoverage  in  the  particular  geographic  area.
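
The arithmetic behind this overstatement can be sketched in a few lines. The example is illustrative only: the function names and every figure are hypothetical, and the undercount rate u is assumed to mean the fraction of the true population that the estimate misses. Under that assumption the published figure exceeds the true one by a factor of 1/(1 - u), which is close to 1 + u when u is small, in line with the approximate proportionality described above.

```python
def per_capita_income(total_income: float, population: float) -> float:
    """Total personal income divided by a population estimate."""
    return total_income / population


def overstatement_ratio(undercount_rate: float) -> float:
    """Ratio of published to true per capita income.

    undercount_rate is the fraction of the true population that the
    census-based estimate misses (an assumed definition).
    """
    return 1.0 / (1.0 - undercount_rate)


# Hypothetical area; none of these figures come from the paper.
true_population = 100_000
undercount_rate = 0.05                       # 5 percent of residents missed
counted_population = true_population * (1.0 - undercount_rate)
total_income = 800_000_000.0                 # dollars, from administrative records

published = per_capita_income(total_income, counted_population)
actual = per_capita_income(total_income, true_population)
overstatement = published / actual - 1.0     # relative overstatement

print(f"published {published:,.0f}  actual {actual:,.0f}  "
      f"overstated by {overstatement:.1%}")
```

With these hypothetical inputs, a 5-percent undercount overstates per capita income by about 5.3 percent.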

The  BEA  has  made  considerable  effort  to  identify  the 
ways  in  which  Federal  programs  use  these  per  capita  income 
estimates.  This  in  itself  has  not  been  easy.  There  is  no 
systematic  process  by  which  data  users  notify  data  producers 
of  their  interest  in  and  use  of  data.  Hence,  one  can  never  be 
sure  that  all  the  uses  have  been  identified. 

As  far  as  I  can  discover,  no  studies  have  been  made  which 
attempt  to  trace  through  the  impact  of  census  undercoverage 
on  programs  using  the  per  capita  income  data.  This  would 
seem  to  be  an  area  crying  for  investigation. 

One  can  hypothesize  that  census  undercoverage  is  greatest
in  low-income  areas;  that  overstatement  of  per  capita  income 
therefore  is  greatest  in  low-income  areas;  and  that  the 
programs  designed  to  reach  low-income  populations,  there- 
fore, contain  a  systematic  bias  against  distributing  funds 
where  they  are  most  needed. 

A  hypothesis  is  a  proposition  to  be  tested;  it  is  not  a 
conclusion.  The  fact  that  the  above  hypothesis  sounds 
plausible  does  not  make  it  correct.  There  is  evidence  that 
census  undercoverage  is  greater  in  low-income  areas.  However,
the  remainder  of  the  above  hypothesis  has  not  been
tested.  It  cannot  be  assumed  that  it  is  correct.  The  studies 
which  have  been  made  of  the  impact  of  undercoverage  on  the 
revenue-sharing  program  and  on  unemployment  rates  have 
yielded  results  which  sometimes  conflict  with  earlier  intuitive 
assumptions.  Study  of  the  effect  of  census  undercoverage  on 
programs  using  BEA  per  capita  income  data  is  appropriate; 
conclusions  about  who,  if  anyone,  is  being  shortchanged 
would  be  premature. 

CONCLUDING  THOUGHTS 

My  first  reaction  in  attempting  to  structure  an  examination
of  the  effect  of  census  undercoverage  on  Federal  programs  is  a  feeling
of  overwhelming  confusion.  There  are  a  multitude  of
programs;  each  one  is  different;  each  one  is  complicated. 
More  investigation  and  more  thought  are  needed  before 
forming  a  judgment  as  to  whether  adjustment  of  census  data 
for  purposes  of  program  use  is  feasible  or  would  contribute 
significantly  to  the  achievement  of  program  goals. 

There  are  certain  basic  points  that  should  be  kept  in  mind 
as  study  and  discussion  go  forward. 

1.  We  need  to  be  concerned,  not  just  with  population 
counts,  but  with  socioeconomic  characteristics  of  the 
population.  For  obtaining  reliable  information  on 
population  characteristics,  statistical  adjustment  is  no 
substitute  for  an  effort  to  obtain  as  complete  a  census 
count  as  possible.  I  stress  this  obvious  point  because  it 
has  from  time  to  time  been  suggested  that  the 
Government  could  save  a  good  deal  of  money  by 
making  less  effort  to  achieve  a  complete  count  and 
placing  more  reliance  on  statistical  adjustment.  If  we 
needed  only  a  population  count,  this  might  be  correct. 
However,  program  uses  of  census  data  demand  know- 
ledge of  population  characteristics  very  difficult  to 
impute  accurately  to  uncounted  people.  The  Census 
Bureau  has  been  correct  in  placing  its  primary 
emphasis  on  obtaining  improved  census  coverage.  None 
of  us  should  allow  ourselves  to  be  distracted  from 
efforts  to  assist  in  achieving  this. 

2.  We  are  concerned  not  just  with  direct  use  of  census 
data  or  even  with  direct  use  of  postcensal  estimates. 
The  census  data  enter  into  other  statistics  used  in 
Federal  program  administration.  Although  these  uses 
are  well  known,  there  has  been  little  examination  of 
the  impacts  of  census  undercoverage.  To  focus  only  on
the  obvious  direct  uses  of  census  data  is  to  fail  to  see
the  forest  for  the  trees.  Such  limited  vision  could  lead 
to  poor  decisions  on  the  question  of  undercount 
adjustment. 

3.  In  reality,  the  impacts  of  undercoverage  may  be 
different  than  we  think.  More  accurate  data  is  a 
desirable  goal  in  and  of  itself,  and  adjustment  for 
undercoverage  certainly  deserves  the  study  and  atten- 
tion it  is  receiving  on  that  ground  alone.  However, 
those  whose  concern  about  undercoverage  stems  from 
the  assumption  that  those  for  whom  they  speak  are 
being  shortchanged  on  Federal  program  assistance 
might  be  well  advised  to  look  more  carefully  at  the 
extent  to  which  this  is  or  is  not  true  on  a  compre- 
hensive basis. 

4.  Finally,  while  data  producers  must  strive  to  meet  the 
needs  of  data  users,  data  users  also  have  some 
responsibility  to  try  to  avoid  impossible  demands. 
Many  of  our  funding  programs  demand  degrees  of 
detail  in  terms  of  geography  and  of  population 
characteristics  which  cannot  be  reliably  produced  by 
any  known  method.  Also,  programs  come  and  go  and 
change  their  design  with  considerable  rapidity.  The 
broad  design  of  data-collection  programs  should 
respond  to  user  needs,  but  it  would  scarcely  be  good 
policy  to  build  every  detail  of  census  data  collection 
around  continuously  shifting  and  sometimes  poorly 
conceived  user  demands.  Along  with  considering 
undercount  adjustment  to  meet  program  needs,  we 
also  should  consider  redesigning  program  data  require- 
ments to  conform  to  the  data  reasonably  likely  to  be 
available  from  a  good  data-collection  program.

Billions  of  dollars  from  the  Federal  Government  are  distributed
annually  among  State  and  local  governments  on  the
basis  of  their  population  size.  In  addition  to  Federal  funds, 
State  governments  also  distribute  revenues  to  their  localities 
on  the  basis  of  population  size.  Although  there  are  currently
no  precise  estimates,  it  is  reasonable  to  conclude  that  literally
tens  of  billions  of  Federal  and  State  dollars  are  distributed
on  the  basis  of  population.

This  paper  considers  the  impact  of  a  census  undercount 
on  this  distribution  process.  It  looks  at  some  specific  pro- 
grams, identifies  potential  losers  and  gainers,  and  analyzes 
the  equity  of  readjustment  of  the  census  for  the  undercount. 

POPULATION  AS  A  FACTOR  IN  FEDERAL 
AND  STATE  TRANSFERS 

Population  is  the  most  often  used  factor  in  formulas  for 
the  distribution  of  Federal  assistance  to  State  and  local 
governments.  More  than  half,  83  of  146,  of  the  Federal  cate- 
gorical programs  use  some  form  of  population  as  a  factor 
in  their  distribution  formulas.  Total  population  is  used  in  28, 
while  some  specific  segment  of  total  population  is  used  in 
the  other  55.  Population-based  programs  cover  a  wide  range 
of  government  functions  including  education,  housing  and 
community  development,  criminal  justice,  employment  and 
training,  social  welfare,  transportation,  water,  and  sewer  [1].

In  addition  to  appearing  in  distribution  formulas,  popula- 
tion also  determines  the  eligibility  of  a  jurisdiction  for  assis- 
tance. In  the  Comprehensive  Employment  and  Training  Act 
of  1973  (CETA),  for  example,  State  and  local  governments 
with  a  population  of  at  least  100,000  qualify  as  prime  spon- 
sors. Smaller  jurisdictions  may  qualify  as  a  group  numbering 
100,000  in  population,  or  if  specifically  authorized  by  the 
Secretary  of  Labor. 
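
A threshold rule of this kind makes eligibility sensitive to small counting errors near the cutoff. The sketch below is a minimal illustration, assuming only the 100,000 minimum and the group option stated above; the function names and population figures are hypothetical, and the discretionary authorization by the Secretary of Labor is not modeled.

```python
PRIME_SPONSOR_MINIMUM = 100_000  # population threshold stated in the text


def qualifies_alone(population: float) -> bool:
    """A single jurisdiction meets the minimum on its own."""
    return population >= PRIME_SPONSOR_MINIMUM


def qualifies_as_group(populations: list[float]) -> bool:
    """Several smaller jurisdictions meet the minimum together."""
    return sum(populations) >= PRIME_SPONSOR_MINIMUM


# Hypothetical jurisdiction just above the threshold, undercounted by 3 percent.
true_population = 102_000
counted_population = true_population * (1 - 0.03)

print(qualifies_alone(true_population))      # eligibility on the true count
print(qualifies_alone(counted_population))   # eligibility on the census count
print(qualifies_as_group([counted_population, 5_000]))  # joined with a neighbor
```

In this hypothetical case the jurisdiction would qualify on its true population, fails on its counted population, and regains eligibility only by combining with a neighbor.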

In  some  programs,  population  plays  a  role  in  determining 
distribution  because  the  variables  used  are  derivatives  of 
population.  This  is  true  of  those  formulas  that  use  per  capita 
data  or  local  employment  or  unemployment  rates.  For  cities 
and  counties,  these  rates  are  determined  on  the  basis  of 
population.  Thus,  while  CETA  titles  II,  IV,  and  VI  do  not 
explicitly  include  population,  the  allocation  of  funds  through 
these  titles  is  affected  by  the  accuracy  of  the  population 
count  as  well  as  the  count  of  the  low-income  population, 
which  is  a  factor  in  the  allocation  of  titles  II  and  IV  funds. 

Population  and  State  Distribution  of  Aid 

Population  also  plays  an  important  role  in  determining  the 
distribution  of  funds  from  States  to  their  localities.  Some
State  aid  programs  specifically  require  the  use  of  census  population
counts.  This  is  the  case  for  the  cigarette,  gas,  and
liquor  taxes  in  the  State  of  Oregon.  In  other  cases,  however, 
alternative  data  may  be  used.  Again,  in  the  State  of  Oregon, 
State  revenue  sharing  is  based  on  population  data  provided 
by  the  demographic  unit  at  Portland  State  University.  Those 
data  are  certified  by  the  State  and  are  used  to  distribute 
revenue-sharing  funds  among  the  localities. 

Where  alternative  data  are  admissible,  the  impact  of  the 
undercount  might  be  diminished.  Yet  even  by  using  their
own  updates  of  population,  jurisdictions  do  not  entirely
escape  the  impact  of  the  census  count.  The  State  of  New
Jersey  has  estimated  its  population
by  jurisdiction  for  1977  and  1978,  for  example,  using 
alternative  methods.  The  housing-unit  method  of  estimation 
it  uses  is  very  much  grounded  in  the  1970  census  count,  even 
though  these  figures  are  adjusted.  Often  the  census  count  is 
the  beginning  point  or  the  control  used  in  alternative  esti- 
mates by  the  Federal  Government,  jurisdictions,  or  private 
firms. 

Education  accounts  for  60  percent,  highways  for  6  per- 
cent, and  general  government  for  10  percent  of  State  aid 
to  local  governments  [11].  From  a  functional  point  of  view,
general  local  government  support,  followed  by  aid  for  high- 
ways, are  the  areas  in  which  the  dollar  volume  of  State  assis- 
tance to  local  governments  is  most  tied  to  total  population 
size.  These  are  followed  by  education,  which,  although 
accounting  for  nearly  60  percent  of  State  aid  to  localities, 
has  its  aid  determined  not  on  the  basis  of  total  population 
but  on  school-age  population. 

Highway  aid  from  States  to  local  governments  is  sensitive 
to  population  counts,  in  part,  because  of  the  way  aid  based 
on  receipts  from  the  motor  vehicle  fuel  sales  tax  is  distri- 
buted. The  majority  of  the  States  with  such  a  tax  distribute 
the  receipts  by  using  population  as  one  of  the  factors  in  the 
formula.  Exceptions  to  this  rule  include  the  States  of  New 
York,  Louisiana,  Maryland,  and  Virginia.

The  sensitivity  of  general  local  government  support  to  the
population-size  factor  is  often  a  reflection  of  how  some 
consumption  or  nuisance  taxes  are  distributed.  In  Maryland, 
receipts  from  the  parimutuel  and  cigarette  taxes  are  distri- 
buted to  counties  and  the  city  of  Baltimore  on  the  basis  of 
population.  Louisiana  distributes  receipts  from  the  cigarette 
tax  also  on  the  basis  of  population.  Minnesota,  Nevada, 
Oregon,  Indiana,  and  Kansas  distribute  revenues  from  this 
tax  as  well  as  the  alcoholic  beverage  tax  on  the  basis  of  pop- 
ulation. 

The  alcoholic  beverage  tax  is  also  used  for  general  support 
of  local  governments  and  distributed  on  the  basis  of  popula- 
tion in  the  States  of  Indiana,  Alabama,  North  Carolina, 
South  Dakota,  Oklahoma,  Rhode  Island,  South  Carolina, 
Tennessee,  Utah,  Virginia,  Wisconsin,  and  Washington. 

Adequacy  of  the  Population  Variable 

The  use  of  population  as  a  factor  for  distributing  Federal 
funds  to  State  and  local  governments  as  well  as  from  States 
to  their  localities  is  based  on  a  number  of  assumptions.  It  is 
assumed  that  population  size  is  an  indicator  of  cost.  There  is 
some  strong  empirical  evidence  to  suggest  that  cost  is  related 
to  population  size  of  government.  However,  the  mathema- 
tical relationship  is  nonlinear— not  linear,  as  is  frequently 
assumed  in  State  aid  formulas  [7].

It  is  also  assumed  that  population  size  is  related  to  need, 
and  that  the  population  variable  is  a  workable  proxy  for  local 
need.  More  precisely,  it  is  assumed  that  a  larger  population 
implies  a  greater  need.  Yet,  80  percent  of  the  50  cities  with 
the  highest  rates  of  poverty  are  small  cities  with  a  population 
of  100,000  or  less  [3].  The  issue  here  is  not  only  that  need
has  various  facets,  but  that  we  must  distinguish  between  the 
intensity  of  a  need  and  its  scale.  From  the  above  example 
of  poverty,  it  is  clear  that  the  intensity  of  poverty  is  greater 
in  small  cities,  but  the  scale  is  greater  in  larger  cities. 

Needs  have  several  dimensions,  not  all  of  which  are 
equally  represented  by  population  size  [4].  Recognizing  this,
some  have  suggested  that  other  variables,  which  are  better
and  more  direct  representations  of  the  need  being  addressed,  should
be  substituted  for  population.  This  argument  has  at  least 
three  practical  weaknesses  recognized  even  by  advocates 
of  alternative  data.  First,  the  data  for  many  of  these  other 
variables  are  not  available  as  often  or  in  as  uniform  a  manner
as  population.  Second,  many  of  these  variables  are  subject 
to  their  own  measurement  errors.  Thus,  for  example,  the 
rates  of  employment  derived  from  industry  or  establishment 
data  do  not  reflect  self-employment  or  employment  in  the 
farm  sector.  They  must  be  adjusted.  Third,  some  of  these 
variables  are  themselves  derivatives  of  population  and, 
consequently,  are  not  immune  to  the  undercount.  A  case  in 
point  is  the  unemployment  or  employment  data  used  to 
allocate  CETA  funds.  These  data  are  derived  from  a  tortuous 
series  of  adjustments  that  lead  to  estimates  for  labor  market 
areas.  A  county's  share  of  the  labor  market's  unemployment
of  new  entrants  is  based  on  its  share  of  the  market's  population
of  persons  aged  14  to  19  in  1970.  The  county's  share  of
unemployed  reentrants  is  based  on  its  share  of  the  market's 
population  20  years  old  and  over  in  1970.  (Its  share  of  the 
unemployment  of  the  experienced  labor  force  is  based  on 
its  proportion  of  recipients  of  benefits  during  the  reference 
period.)  The  city's  share  of  the  county's  unemployed  and 
employed  is  based  on  its  share  of  these  two  factors  in  the 
county  in  1970.  A  census  that  severely  undercounts  blacks 
and  Hispanics  would  underestimate  unemployment  (a  key 
factor  in  the  CETA  allocation  criteria)  in  those  cities  with 
a  high  proportion  of  blacks  and  Hispanics.  The  undercount 
in  1970  was  most  severe  within  the  black  and  presumably 
Hispanic  labor-force  age  range. 
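
The share-based disaggregation described above can be sketched as follows. This is a simplified illustration, not the actual estimation procedure: the function names, component labels, and all figures are hypothetical, and the labor-market totals are held fixed so that only the county's population shares change.

```python
def share(part: float, whole: float) -> float:
    """A county's fraction of a labor-market total."""
    return part / whole


def county_unemployment(market_unemployed: dict, county_share: dict) -> float:
    """Sum each labor-market component times the county's share of it."""
    return sum(market_unemployed[k] * county_share[k] for k in market_unemployed)


# Hypothetical labor-market totals (persons unemployed), by component.
market = {"new_entrants": 1_000, "reentrants": 2_000, "experienced": 5_000}

# Shares built from census counts (ages 14-19, ages 20 and over)
# and from benefit records.
census_shares = {
    "new_entrants": share(3_000, 10_000),
    "reentrants": share(25_000, 100_000),
    "experienced": share(400, 1_000),
}

# The same county if the census had missed 300 youths and 2,000 adults there.
corrected_shares = {
    "new_entrants": share(3_300, 10_300),
    "reentrants": share(27_000, 102_000),
    "experienced": share(400, 1_000),  # benefit records, not census counts
}

census_based = county_unemployment(market, census_shares)
corrected = county_unemployment(market, corrected_shares)
print(round(census_based), round(corrected))
```

Because two of the three components are apportioned by census population, an undercount concentrated in the county lowers its estimated unemployment even though the benefit-based component is unchanged.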

Studies  have  shown  that  even  those  formulas  that  weigh 
population  heavily  lead  to  a  reasonably  efficient  distribution 
of  Federal  funds— to  the  extent  that  efficiency  means  target- 
ing to  those  areas  of  greatest  need,  where  need  is  defined  as  a 
composite  of  socioeconomic  variables  [6].  Admittedly,  some
of  these  composite  indicators  of  need  include  some  aspect 
of  population.  Population  therefore  appears  in  some  form  in 
both  sides  of  the  equation,  i.e.,  in  defining  need  and  in  determining
the  distribution  of  funds.  In  particular,  population
often  appears  as  a  change  variable.  It  is  assumed  that  "de- 
clining" cities  have  different  needs  than  "growing"  cities. 
Population  decline  is  highly  correlated  with  fiscal  strain, 
the  age  of  the  housing  stock,  and  the  conditions  of  the 
infrastructure  [2].

Limits  of  the  Population  Variable  in  the 
Distribution  Process 

It  would  appear  that,  pari  passu,  an  undercount  would
lead  to  an  erroneous  distribution  of  funds  in  those  instances
where  population  is  a  factor  in  the  distribution  formula  or
is  used  as  a  major  element  in  determining  eligibility.  The
impact  of  the  undercount  is,  however,  limited  by  several  factors.  First,
population  is  rarely  the  sole  determinant  of  either  eligibility 
or  the  amount  of  funds  a  jurisdiction  gets  based  on  the  use 
of  a  formula.  Yet,  it  usually  has  a  very  strong  influence  be- 
cause of  a  large  variance,  i.e.,  cities  vary  more  according  to 
population  and  population  growth  than  along  other  variables. 
Second,  the  population  variable  may  be  highly  correlated 
with  the  other  variables  in  the  formula. 

Third,  many  programs  have  limits— minima  and  maxima— 
that  in  turn  limit  the  impact  of  population.  Fourth,  some 
programs  have  alternative  formulas  in  which  population  is 
weighted  differently,  and  a  jurisdiction  has  the  right  to  use 
the  formula  that  gives  it  the  most  favorable  treatment.  Fifth, 
population  is  used  in  some  formulas  as  an  "impaction"  or 
relative  variable,  so  that  it  is  not  the  absolute  but  the  relative 
level  (used  as  a  weight)  that  matters.  Sixth,  some  programs 
provide  the  authority  and  funds  to  the  Secretary  for  making 
adjustments  necessary  because  of  errors  in  the  formula; 
presumably,  these  funds  could  be  used  to  compensate  for
the  undercount.  Seventh,  the  effects  of  an  undercount  can
be  offset  where  a  consortium  is  possible.  CETA  provides 
for  meeting  the  100,000-population  minimum  through  a 
consortium.  Eighth,  often  the  population  variable  is  only  a 
segment  of  the  total  population;  unfortunately,  the  target 
population  is  frequently  one  that  is  likely  to  be  seriously 
undercounted.  This  is  the  case  where  low-income  population 
is  a  target.  It  is  also  the  case  where  labor-force  information 
is  important  because  the  undercount  rate  of  black  males  of 
labor-force  age  is  extremely  high.  Ninth,  alternative  methods 
and  data  used  to  estimate  population  in  nondecennial  years 
differ  in  the  extent  to  which  they  reflect  the  undercount.

ASSESSING  THE  IMPACT  OF  AN  UNDERCOUNT 
ON  FEDERAL  TRANSFERS 

In  the  preceding  sections,  we  established  that  population 
size  (total  and  by  segments)  is  an  important  variable  in  al- 
locating State  and  Federal  Government  assistance,  and  that 
these  funds  go  toward  financing  vital  State  and  local  func- 
tions. In  this  section,  we  look  at  the  impact  of  the  under- 
count on  specific  programs.  Because  each  program  uses 
population  differently,  we  can  only  obtain  an  accurate  assess- 
ment of  the  impact  of  population  on  a  recipient  by  looking 
at  each  program  separately.  To  fully  appreciate  the  under- 
count and  its  effects  on  each  of  these  programs,  we  must  go 
through  the  tedious  exercise  of  understanding  the  allocation 
process  of  each  program. 

Sometimes  knowing  the  Federal  allocation  formula  is  not 
enough  to  determine  the  losers  and  gainers  of  an  undercount. 
This  is  the  case  where  Federal  assistance  to  States  amounts  to 
a  pass-through  of  aid  to  localities  and  where  States  and 
localities  are  permitted  to  set  up  independent  distribution 
procedures.  Title  I  of  the  Elementary  and  Secondary  Educa- 
tion Act  of  1965  as  amended,  for  example,  distributes  funds 
to  counties  mostly  on  the  basis  of  the  number  of  children  5  to  17  years
of  age  who  are  determined  to  be  from  poor  families.  How  the 
counties  and  States  distribute  these  funds  to  school  districts 
may  differ  from  State  to  State  [8].  In  other  cases,  while  aid
goes  to  a  locality,  the  real  losers  are  a  specific  class  of  resi- 
dents in  that  locality  rather  than  the  jurisdiction  in  general. 

The  number  of  programs  using  population  is  too  great 
for  us  to  consider  each  independently.  For  this  reason,  in 
this  section  we  shall  concentrate  on  three  programs:  General 
Revenue  Sharing,  the  Community  Development  Block 
Grant,  and  the  Farmers  Home  Administration  Housing 
Program  (Section  502). 

General  Revenue  Sharing 

The  State  and  Local  Fiscal  Assistance  Act  of  1972  pro- 
vides for  the  distribution  of  $55  billion  to  State  and  local 
governments.  Some  39,000  units  of  government  use  these 
funds  for  a  variety  of  functions  ranging  from  police  to 
education.  The  main  ingredient  of  this  program  is  that  it
provides  a  minimum  of  strings  and  a  maximum  of  flexibility
of  use  by  the  recipients. 

Allocation  formula.  Each  year,  funds  are  allocated  first 
among  States  using  a  five-factor  or  a  three-factor  formula, 
whichever  gives  the  individual  State  the  greater  allotment. 
The  five-factor  formula  is  based  on  population,  urbanized 
population,  population  inversely  weighted  for  per  capita 
income,  income  tax  collections,  and  general  tax  effort.  The 
three-factor  formula  uses  population,  general  tax  effort,  and 
relative  income. 

Once  the  distribution  among  States  is  accomplished,  one- 
third  of  each  State's  allocation  is  awarded  to  the  State  gov- 
ernment for  its  use.  The  remaining  two-thirds  is  divided  by 
the  State  population  size  to  obtain  an  average  per  capita 
grant  for  that  State.  As  we  shall  see,  this  figure  provides  an 
upper  limit  to  the  allocation  for  each  jurisdiction. 

The  local  government  pool  for  each  State  is  distributed 
among  county  areas  (not  necessarily  county  governments) 
on  the  basis  of  relative  population,  general  tax  effort,  and 
relative  income  of  each  area.  No  county  area  may  obtain 
more  than  145  percent  or  less  than  20  percent  of  the  per 
capita  figure  referred  to  above. 

Each  county  area  allocation  is  divided  up  such  that  an 
amount  goes  to  Indian  tribal  governments  on  the  basis  of 
their  population  relative  to  the  population  of  the  total 
county  area.  From  the  balance,  a  county  government's  allo- 
cation is  determined  on  the  basis  of  the  government's  ad- 
justed taxes  in  the  county  area.  No  county  government  is 
allocated  more  than  50  percent  of  its  adjusted  taxes  and 
transfers.  The  remaining  portion  is  allocated  among  cities 
and  townships  on  the  basis  of  a  single  formula,  using  popula- 
tion, general  tax  effort,  and  relative  income.  No  unit  of  local 
government  or  township  may  receive  more  than  the  145- 
percent  upper  limit  or  less  than  the  20-percent  lower  limit 
of  the  average  per  capita  allotment  referred  to  above.  If  a 
unit  receives  less  than  the  20  percent,  its  allotment  is  in- 
creased to  either  the  20-percent  level  or  50  percent  of  its 
adjusted  taxes  and  transfers,  whichever  is  lower. 
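
The per capita limits described above can be sketched as a simple bounding rule. This is a minimal illustration under stated assumptions, not the statutory computation: the function names and dollar figures are hypothetical, the redistribution of amounts freed or required by the limits is not modeled, and the separate 50-percent cap on county governments is omitted.

```python
def local_pool_per_capita(state_allocation: float, state_population: float) -> float:
    """Two-thirds of the State allocation, per State resident."""
    local_pool = state_allocation * 2.0 / 3.0
    return local_pool / state_population


def bounded_per_capita(raw: float, average: float, taxes_and_transfers: float) -> float:
    """Apply the 145-percent ceiling and the 20-percent / 50-percent floor.

    All three arguments are per capita dollar amounts for one local unit.
    """
    ceiling = 1.45 * average
    # The floor is the lower of 20 percent of the State average and
    # 50 percent of the unit's adjusted taxes and transfers.
    floor = min(0.20 * average, 0.50 * taxes_and_transfers)
    if raw > ceiling:
        return ceiling
    # A unit below the floor is raised to it; other units are unchanged.
    return max(raw, floor)


# Hypothetical State: $90 million allocation, 3 million residents.
average = local_pool_per_capita(90_000_000.0, 3_000_000.0)

for raw, taxes in [(35.0, 100.0), (10.0, 100.0), (2.0, 100.0), (2.0, 6.0)]:
    print(raw, taxes, bounded_per_capita(raw, average, taxes))
```

The four hypothetical units show, in turn, a unit held to the ceiling, a unit left unchanged, a unit raised to the 20-percent level, and a unit raised only to 50 percent of its adjusted taxes and transfers.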

Impact  of  an  undercount.  Based  on  the  above  procedure, 
it  is  clear  that  the  allocation  of  general  revenue  sharing  funds 
is  tied  to  an  accurate  count  of  total  as  well  as  urbanized 
population  in  each  State.  For  the  State,  the  impact  of  the 
undercount  can  be  reduced  somewhat  by  moving  from  one 
formula  to  the  other.  Note,  however,  that  this  option  is 
not  available  to  the  localities  in  the  State. 

The  major  defense  built  in  for  localities  is  the  20-percent
or  50-percent  lower  limit.  As  long  as  these  lower  limits  are 
not  reached,  the  only  hope  of  a  locality  which  has  been 
undercounted  is  that  some  other  jurisdiction  has  surpassed 
its  145-percent  limit,  thereby  triggering  a  redistribution  of 
the  excess.  On  the  other  hand,  a  corrected  count  would  not 
help  a  locality  which  has  been  undercounted  if  that  locality 
has  already  surpassed  its  upper  limit.

Assuming  that  a  locality  has  experienced  an  undercount
and  would  not  be  at  its  limits,  that  locality  would  be  a  loser 
as  a  result  of  the  undercount.  The  State  also  might  be  a  loser, 
provided  that  its  population  count  reflects  the  undercount.
This,  of  course,  depends  upon  the  method  used  to  reconcile 
differences  between  the  State  count  and  the  counts  of  each 
of  its  components.  In  any  event,  there  are  more  buffers  for 
the  State  than  for  the  locality,  although  both  could  con- 
ceivably be  losers  as  a  result  of  the  undercount. 

If  the  locality's  undercount  is  not  expressed  in  the  State 
figure,  then  any  adjustment  to  the  locality's  figure  results 
in  an  intrastate  redistribution.  One  study  has  shown  that  for 
the  States  of  New  Jersey  and  Virginia,  such  intrastate  redis- 
tribution results  in  no  more  than  a  5-percent  change  in  the 
allotment  of  all  jurisdictions  in  the  State  [10].  No  jurisdiction
gained  more  and  none  lost  more  than  5  percent.  It  should 
be  noted,  however,  that  this  is  a  one-time  loss.  To  the  extent 
that  the  undercount  is  reflected  in  annual  allotments  through 
estimates  using  the  undercounted  population  as  a  base  (which 
is  true  of  most  methods  of  population  estimates),  the  under- 
counted  entity  experiences  an  annual  loss  until  a  true  count 
is  obtained. 
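
The zero-sum character of an intrastate redistribution can be shown with a deliberately simplified sketch. It assumes a fixed local pool shared in proportion to population alone, which ignores the tax-effort and income factors and the per capita limits of the actual formula; the jurisdiction names and figures are hypothetical.

```python
def allocate(pool: float, populations: dict) -> dict:
    """Split a fixed pool in proportion to each jurisdiction's population."""
    total = sum(populations.values())
    return {name: pool * pop / total for name, pop in populations.items()}


pool = 1_000_000.0                                  # hypothetical local pool, dollars
counted = {"A": 50_000, "B": 30_000, "C": 20_000}   # census counts
adjusted = dict(counted, A=52_500)                  # A's count raised by 5 percent

before = allocate(pool, counted)
after = allocate(pool, adjusted)
change = {name: after[name] - before[name] for name in counted}

for name in counted:
    print(name, round(before[name]), round(after[name]), round(change[name]))
```

Because the pool is fixed, whatever jurisdiction A gains from the corrected count is lost by B and C, and A's proportional gain is smaller than the 5-percent correction to its population.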

To  the  extent  that  the  undercount  is  reflected  in  the 
State  population  total,  an  adjustment  will  result  in  an  inter- 
state distribution  of  funds.  The  extent  to  which  there  are 
gains  and  losses  in  an  interstate  distribution  is  not  bounded 
by  limits  such  as  the  per  capita  entitlement,  which  limits 
intrastate  allocations.  Yet,  at  least  one  study  has  shown 
that  the  gains  and  losses  resulting  from  a  readjustment  of 
State  figures  will  hardly  approach  4  percent  [9].

In  the  case  of  general  revenue  sharing,  the  losers  and 
gainers  are  jurisdictions  (States  and  local  governments)  as 
opposed  to  identifiable  individuals  or  an  identifiable  class  of 
individuals  or  jurisdictions.  This  is  so  because  general  revenue 
sharing  allotments  enter  local  budgets  unearmarked  for 
specific  beneficiaries.  This  is  not  the  case  in  all  programs. 

The  Community  Development  Block  Grant 

The  Community  Development  Block  Grant  (CDBG) 
program  combines  several  former  categorical  programs  ad- 
ministered by  the  Department  of  Housing  and  Urban  Devel- 
opment. These  categorical  programs  included  funds  for  urban 
renewal,  neighborhood  development,  water  and  sewer 
systems,  and  "open  space";  public  facility  loans;  neighbor- 
hood facilities  grants;  and  Model  Cities  supplemental  grants. 
Assistance  from  these  programs  was  obtained  on  a  com- 
petitive basis  with  no  assurance  of  long-term  funding.  The 
recipients  of  these  awards  were  also  required  to  follow 
detailed  rules  and  regulations.  Often,  local  priorities  were 
sacrificed  in  an  attempt  to  obtain  assistance  and  to  imple- 
ment these  programs.  Since  its  beginning,  some  $15.5  billion 
have  been  appropriated  to  CDBG. 

The  general  objective  of  the  Community  Development 
Block  Grant  program  is  to  provide  for  viable  communities
through  improved  housing  and  neighborhood  conditions  and
through  expanded  economic  opportunities  primarily  for  low- 
and  moderate-income  persons.  Specifically,  the  1974  act,  as 
amended,  aims  to: 

1.  Eliminate  and  prevent  slums  and  blight;

2.  Eliminate  conditions  detrimental  to  health,  safety,  and 
public  welfare; 

3.  Conserve  and  expand  the  Nation's  housing  stock; 

4.  Expand  and  improve  the  quantity  and  quality  of
community  services;

5.  Improve  the  rational  use  of  land  and  other  natural
resources;

6.  Reduce  the  concentration  and  isolation  of  income
groups  in  specific  neighborhoods; 

7.  Restore  and  preserve  historic  sites;  and 

8.  Alleviate  physical  and  economic  distress. 

As  a  mechanism  for  meeting  these  national  objectives,  the 
Community  Development  Block  Grant,  unlike  its  prede- 
cessors, is  designed  to  give  considerable  discretion  to  local 
governments.  These  governments  may  choose  among  the 
eight  national  objectives  in  a  manner  that  respects  their  local 
conditions  and  the  desire  of  local  citizens.  In  addition, 
unlike  its  predecessors,  the  CDBG  program  provides  for  a  predictable
flow  of  funds  to  local  governments  on  an  annual
basis. 

Allocation  formula.  The  allocation  of  the  CDBG  funds 
occurs  in  the  following  manner:  80  percent  of  the  annual 
appropriation  is  set  aside  for  metropolitan  areas  and  20  per- 
cent for  nonmetropolitan  areas. 

The  metropolitan  allotment  is  divided  between  urban 
counties  and  entitlement  cities.  In  order  to  be  eligible,  an 
urban  county  must  have  a  population  of  at  least  200,000 
residents  (excluding  central  cities)  and  have  the  authority 
to  conduct  community  development  activities.  A  metro- 
politan entitlement  city  must  have  a  population  of  at  least 
50,000. 

The  entitlement  to  each  metropolitan  city  is  based  on 
the  greater  of  the  amounts  derived  from  one  of  two  formu- 
las. The  first  formula  uses  the  population  of  the  city  relative 
to  the  population  of  all  metropolitan  areas,  the  extent  of 
poverty  in  the  city  relative  to  all  metropolitan  areas,  and  the 
extent  of  housing  overcrowding  in  the  city  relative  to  all 
metropolitan  areas.  In  this  formula,  population  and  over- 
crowding each  get  25  percent  of  the  weight,  while  poverty 
gets  50  percent. 

The  second  formula  uses  the  extent  of  population  growth 
lag  in  the  metropolitan  city  relative  to  all  metropolitan 
areas.  Growth  lag  is  the  difference  between  the  actual  growth 
of  an  area  (using  1960  population  and  boundaries  as  a  base) 
and  the  growth  it  would  have  experienced  had  it  grown  at 
the  same  rate  as  similar  areas  in  the  Nation  as  a  whole.  The 
formula  also  uses  the  extent  of  poverty  in  the  city  relative 
to all metropolitan areas and the age of housing in the city
relative to all metropolitan areas. In this formula, population
growth lag is assigned 20 percent of the weight, poverty 30
percent, and age of housing 50 percent.
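As a rough illustration of the dual formula just described, the sketch below computes a city's weighted share under each formula and takes the greater amount. The function names, the dictionary keys, the sample figures, and the assumption that the entitlement equals the weighted share multiplied by a pool of funds are illustrative only; hold-harmless provisions and other adjustments are ignored.

```python
# Sketch of the dual-formula choice for a metropolitan city.
# Assumption: each factor enters as the city's share of the total for
# all metropolitan areas, and the entitlement is the weighted share
# multiplied by a hypothetical pool of funds.

FORMULA_1 = {"population": 0.25, "poverty": 0.50, "overcrowding": 0.25}
FORMULA_2 = {"growth_lag": 0.20, "poverty": 0.30, "old_housing": 0.50}


def weighted_share(city, metro_total, weights):
    """Weighted sum of the city's share of each factor."""
    return sum(w * city[k] / metro_total[k] for k, w in weights.items())


def entitlement(city, metro_total, pool):
    """Return (amount, formula_used), taking the greater of the two."""
    a1 = pool * weighted_share(city, metro_total, FORMULA_1)
    a2 = pool * weighted_share(city, metro_total, FORMULA_2)
    return (a1, "first") if a1 >= a2 else (a2, "second")


# Made-up figures for an older, slow-growing city.
city = {"population": 100_000, "poverty": 20_000, "overcrowding": 3_000,
        "growth_lag": 15_000, "old_housing": 25_000}
metro = {"population": 100_000_000, "poverty": 10_000_000,
         "overcrowding": 4_000_000, "growth_lag": 5_000_000,
         "old_housing": 10_000_000}

amount, used = entitlement(city, metro, pool=2_000_000_000)
print(used, round(amount))
```

With these invented numbers the second formula yields the larger amount, which matches the text's observation that older, slow-growing cities tend to rely on it.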

Similarly,  an  urban  county's  entitlement  is  calculated  by 
a  dual  formula  in  which  the  variables  given  above  are  ex- 
pressed as  the  conditions  in  that  urban  county  relative  to 
all  metropolitan  areas  and  in  which  the  weights  are  the  same. 
Like  the  metropolitan  city,  the  urban  county  chooses  the 
formula  that  gives  it  the  greater  entitlement. 

The  law  also  provides  an  allotment  that  would  protect 
cities  that  have  a  smaller  entitlement  than  they  historically 
received  from  the  categorical  programs  that  were  combined 
into  CDBG.  Once  the  entitlements  have  been  determined  and 
the  "hold-harmless"  protection  has  been  arrived  at,  the 
so-called  metropolitan  discretionary  balance  is  allocated  for 
the  direct  use  of  smaller  metropolitan  cities  and  counties  or 
to  the  States  for  use  within  metropolitan  areas. 

The  allocations  of  the  discretionary  balance  are  derived 
from  a  formula  that  is  based  on  the  population  of  the  metro- 
politan areas  in  a  State  relative  to  the  population  of  metro- 
politan areas  of  all  States,  the  extent  of  poverty  in  the 
metropolitan  areas  of  a  State  relative  to  poverty  in  the 
metropolitan  areas  of  all  States,  and  housing  overcrowding 
in  the  metropolitan  areas  of  a  State  relative  to  such  over- 
crowding in  the  metropolitan  areas  of  all  States.  The  poverty 
factor  gets  50  percent  of  the  weight  in  this  formula,  while 
the  population  and  overcrowding  each  get  25  percent. 

An  alternative  formula,  which  may  be  used  in  allocating 
this  metropolitan  discretionary  balance,  uses  the  age  of 
housing,  the  extent  of  poverty,  and  population  in  the  metro- 
politan area  of  the  State  as  these  factors  compare  to  the 
metropolitan  areas  in  all  States.  These  variables  are  expressed 
as  relatives  (as  in  the  first  formula),  but  the  weights  are  dif- 
ferent. Age  of  housing  gets  50  percent  of  the  weight,  poverty 
gets  30  percent,  and  population  gets  20  percent. 

The  80  percent  set  aside  for  metropolitan  areas  is  distri- 
buted according  to  the  procedures  described  above.  The  20 
percent  set  aside  for  nonmetropolitan  areas  is  distributed  in 
essentially  two  steps.  The  first  meets  the  hold-harmless  com- 
mitments to  cities  in  nonmetropolitan  areas.  Once  this  has 
been  done,  the  balance  (the  nonmetropolitan  discretionary 
balance)  is  allocated  among  States  for  use  in  nonmetropolitan 
areas  on  the  basis  of  one  of  two  formulas,  whichever  gives 
the  greater  amount  to  the  nonmetropolitan  areas  of  the 
State. The first formula uses the population of the nonmetropolitan
areas of a State compared with the nonmetropolitan areas of all
States, the extent of poverty in the nonmetropolitan areas of a
State compared with the nonmetropolitan areas of all States, and
the extent of housing overcrowding in the nonmetropolitan areas
of the State compared with the nonmetropolitan areas of all
States. In this formulation, poverty is assigned 50 percent of
the weight, and population and housing overcrowding are each
assigned 25 percent.

The alternative formula uses the age of housing in the
nonmetropolitan area in the State compared with similar
areas  in  all  States,  the  extent  of  poverty  in  the  nonmetro- 
politan area  of  the  State  compared  to  all  States,  and  the 
population  of  the  nonmetropolitan  area  of  the  State  com- 
pared to  the  same  areas  in  all  States.  In  this  formulation, 
the  age  of  housing  is  assigned  50  percent  of  the  weight, 
poverty,  30  percent,  and  population,  20  percent. 

Impact of an undercount. From the description of the
allocation procedure, it is obvious that an undercount in any
single jurisdiction, whether or not it qualifies for an
automatic entitlement to CDBG funds, affects not only that
jurisdiction but all others in the State as well. This is
because  of  the  metropolitan  and  nonmetropolitan  discre- 
tionary balances  which  are  based  on  the  populations  of  these 
areas  in  each  State.  Hence,  even  if  a  small  nonmetropolitan 
city  is  not  interested  in  being  a  recipient  of  such  assistance, 
its  population  figure  is  used  in  determining  the  overall  allo- 
cation to  its  State  for  nonmetropolitan  areas  in  that  State. 
A  similar  situation  holds  for  metropolitan  cities. 

The  use  of  alternative  formulas  in  which  population  is 
assigned  different  weights  gives  the  localities  a  way  of  re- 
ducing (but  not  necessarily  eliminating)  the  loss  due  to  a 
population  undercount.  Recall  that,  in  the  case  of  general 
revenue  sharing,  only  States  had  this  protection. 

In  the  first  formula,  an  undercount  will  simply  lead  to  a 
reduced  entitlement— presuming  that  a  city  is  not  denied 
entitlement  status  because  of  an  undercount  that  puts  it 
below  50,000.  But,  this  is  not  necessarily  the  case  in  the 
second  formula.  This  second  formula  is  particularly  impor- 
tant, since  it  is  the  one  used  by  most  distressed  cities— those 
large  cities  that  have  slow  or  negative  growth  rates  and  a 
variety  of  other  socioeconomic  symptoms  of  distress.  This 
formula, largely because of the population growth-lag
variable, effectively concentrates substantial CDBG aid on
these cities [5]. To understand this, let's work with a
simple paradigm, which is illustrated in the appendix. Let
us  look  at  a  city  that  suffers  an  undercount  in  the  base  year 
and  ask  what  that  undercount  would  do  to  that  city's 
entitlement,  assuming  alternatively  a  city  that  has  a  growing, 
stable,  or  declining  population.  We  shall  also  assume  that  the 
Census  Bureau  has  achieved  an  accurate  count  in  the  current 
year. 

An  undercount  in  the  base  year  (1970)  and  an  accurate 
count  in  the  current  year  (1980)  will  hurt  a  growing  city 
because  it  would  exaggerate  its  true  rate  of  growth  and  could 
lead  to  a  zero  value  for  that  jurisdiction  as  far  as  the  growth- 
lag  variable  is  concerned.  The  alternative  for  that  city  is  to 
shift  to  the  first  formula,  where  population  size  is  the  factor. 
But  this  may  be  no  more  than  a  loss  minimization  procedure 
rather  than  one  that  will  yield  no  loss  at  all. 

For  a  declining  city,  an  undercount  in  the  base  year  with 
an  accurate  count  in  the  current  year  would  hurt  if  the  cur- 
rent count  is  above  the  base-year  figure.  This  situation  will 
result  in  the  city's  registering  a  growth  rather  than  a  decline. 
The  loss  would  be  less  severe  if  the  city  declines  below  the 
incorrectly  enumerated  level  in  the  base  year,  for  then  it 
would  be  registered  as  a  declining  city  with  a  definite  growth 
lag.  Even  in  this  case,  the  city  suffers  a  loss  because  its  de- 
cline is  undercounted. 

If  the  city  is  stable,  under  the  same  assumptions  in  the 
above  paragraphs,  it  will  be  hurt.  This  is  so  because  the  city 
will  be  shown  as  a  growth  city. 

Let  us  turn  to  the  case  where  the  count  in  the  base  year 
was  accurate,  but  there  is  an  undercount  in  the  current  year. 
If  the  undercount  yields  a  population  size  that  is  above  the 
base  year,  then  the  city  will  be  aided  in  the  growth-lag 
formula,  since  its  true  growth  rate  will  be  underestimated. 
Of  course,  it  will  be  aided  even  more  if  the  undercount  led 
to  a  figure  below  its  base-year  level.  In  that  case,  the  city 
would  be  reported  as  declining. 

If  the  city  is  declining,  an  undercount  in  the  current  year 
will  work  to  the  advantage  of  that  city  because  it  will  ex- 
aggerate its  growth  lag.  To  illustrate,  a  decline  means  that 
its  true  current  population  is  below  its  population  in  the 
base  period,  and  an  undercount  of  its  true  current  population 
means  a  figure  below  its  level  of  decline. 

If  the  city  is  stable,  an  undercount  will  show  this  city  as 
declining  and  therefore  increase  its  allocation  under  the 
growth-lag  formula. 

A  generalization  is  therefore  derived:  An  undercount  in 
one  period  followed  by  a  correct  count  in  the  next  will 
always  unfairly  reduce  the  amount  a  jurisdiction  received 
through  the  CDBG  growth-lag  formula  by  overestimating 
its  growth.  But,  a  correct  count  in  one  period  followed  by 
an  undercount  in  the  next  will  always  unfairly  increase  the 
amount  received  by  underestimating  growth.  This  holds 
whether  the  jurisdiction  is  growing,  stable,  or  declining. 
An  undercount  in  both  periods  leads  to  uncertain  results. 
(See  the  appendix.) 
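The generalization can be checked with a small numerical sketch. It assumes that growth lag is the shortfall of the reported current population from what the city would have had at a reference growth rate, floored at zero (the text notes that a growing city can end up with a zero value). The function name, the 10 percent reference rate, the 3 percent undercount, and the city figures are all hypothetical.

```python
def growth_lag(base_pop, current_pop, reference_rate):
    """Shortfall from the population expected at the reference growth rate.

    Assumption: the lag is floored at zero, so a city that outgrows the
    reference rate simply gets no credit on this variable.
    """
    expected = base_pop * (1 + reference_rate)
    return max(0.0, expected - current_pop)


REFERENCE_RATE = 0.10   # hypothetical growth rate of comparable areas
UNDERCOUNT = 0.03       # hypothetical 3 percent undercount

# True (base-year, current-year) populations for three kinds of city.
cities = {
    "growing": (100_000, 105_000),
    "stable": (100_000, 100_000),
    "declining": (100_000, 92_000),
}

for name, (base, current) in cities.items():
    true_lag = growth_lag(base, current, REFERENCE_RATE)
    # Undercount in the base year only: reported growth is overstated.
    base_under = growth_lag(base * (1 - UNDERCOUNT), current, REFERENCE_RATE)
    # Undercount in the current year only: reported growth is understated.
    current_under = growth_lag(base, current * (1 - UNDERCOUNT), REFERENCE_RATE)
    print(f"{name:9s} true={true_lag:8.0f} "
          f"base-year undercount={base_under:8.0f} "
          f"current-year undercount={current_under:8.0f}")
```

In each of the three cases the base-year undercount lowers the measured lag and the current-year undercount raises it. With the floor at zero, a city whose true lag is already zero cannot be pushed lower, so for such a city the first effect disappears.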

While  this  certainly  implies  that  a  city  might  be  better 
off  being  undercounted  in  1980,  the  incentives  to  avoid 
this  undercount  are  great.  First,  the  benefits— even  using  the 
population  lag— are  only  the  marginal  differences  between 
using  the  population-lag  formula  and  the  population-size 
formula.  Second,  the  population-lag  variable,  while  it  might 
lead  to  a  higher  allocation,  is  only  one  of  three  variables  in 
the  equation.  All  three  variables  tend  to  be  highly  correlated 
with  population  growth;  hence,  if  a  city's  growth  rate  were 
underestimated  as  a  result  of  the  undercount,  the  other 
variables  will  dampen  the  effect,  since  they  also  tend  to  be 
strongly  correlated  with  growth.  For  example,  a  growth 
city  will  have  a  relatively  low  rate  of  poverty  and  a  relatively 
young  housing  stock,  and  these  tend  to  reduce  allocations.1 


1 The formula contains three variables: growth lag, poverty, and
age of housing. If population has grown rapidly (but the growth is
underestimated because of the undercount), the city will be helped.
But poverty and age of housing are inversely correlated with growth
and positively correlated with allocations; hence, they will tend to
offset or "dampen" the effect of the undercount that underestimated
growth.


Third,  an  undercount  would  work  to  the  detriment  of  the 
city  in  many  other  population-based  formulas,  such  as 
revenue  sharing  and  CETA. 

An  adjustment  for  the  population  undercount  of  an 
entitlement  city  or  urban  county  would  lead  to  an  interstate 
redistribution  of  funds.  Recall  that  there  is  no  State  allot- 
ment as  in  the  case  of  general  revenue  sharing.  Thus,  the 
adjustment  necessary  to  compensate  a  jurisdiction  that  is 
undercounted  is  spread  over  many  other  recipients  and 
could  appear  as  a  minor  reduction  in  their  entitlement. 
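A short numerical sketch may make the point about spreading the adjustment concrete. It assumes a fixed pot divided strictly in proportion to population among hypothetical equal-sized recipients, which is much simpler than any actual program formula; the pot size and the counts are invented.

```python
def allocate(pot, populations):
    """Divide a fixed pot in proportion to population."""
    total = sum(populations)
    return [pot * p / total for p in populations]


POT = 100_000_000                    # hypothetical appropriation
reported = [100_000] * 500           # 500 hypothetical recipients
corrected = reported.copy()
corrected[0] = 105_000               # one city's count raised by 5 percent

before = allocate(POT, reported)
after = allocate(POT, corrected)

gain_pct = 100 * (after[0] - before[0]) / before[0]
loss_pct = 100 * (before[1] - after[1]) / before[1]
print(f"corrected city gains {gain_pct:.2f} percent")
print(f"each other recipient loses {loss_pct:.3f} percent")
```

Under these assumptions the corrected city gains nearly the full 5 percent, while each of the other recipients gives up only about a hundredth of a percent.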

Among  nonentitlement  cities  (those  funded  from  the 
discretionary  balances),  an  adjustment  would  also  lead  to  an 
interstate  redistribution,  since  their  figure  theoretically 
should  be  reflected  in  the  size  of  the  metropolitan  and 
nonmetropolitan  populations  used  in  the  formulas.  Unlike 
the  entitlement  case,  however,  the  incentive  for  urging  a 
correction  on  the  part  of  a  nonentitlement  city  is  low.  The 
reason  for  this  is  that  the  nonentitlement  pool,  while  de- 
termined on  the  basis  of  a  formula,  is  allocated  on  a  com- 
petitive basis.  There  is  no  assurance  that  the  undercounted 
jurisdiction  would  be  the  beneficiary  of  an  adjustment  that 
increased  the  size  of  the  State  allocation.  Furthermore,  since 
these  nonentitlement  cities  are  small  (below  50,000  in  popu- 
lation) and  because  the  population  variable  is  used  as  a 
relative  rather  than  an  absolute  figure,  the  likelihood  that  an 
undercount  in  any  one  of  these  cities  would  markedly  reduce 
the  State  discretionary  allotment  is  very  remote.  The  most 
likely  losers  and  gainers  from  an  incorrect  population  count 
are  therefore  entitlement  cities,  or  cities  that  would  have 
been  eligible  for  entitlement  if  an  undercount  had  not  pro- 
duced a  population  figure  of  less  than  50,000. 

The CDBG program also contains one element that could
reduce the redistribution of funds between gainers and losers
that results from the population undercount. The Secretary
has a discretionary fund that provides for, among other
uses, adjustments for inequities arising from the
allocation formula.

The  Farmers  Home  Administration  Housing  Program 

The  Farmers  Home  Administration  Housing  Program  aims 
at  improving  the  quality  of  the  housing  stock  in  small  juris- 
dictions. It  began  through  the  Federal  Housing  Act  of  1949, 
which  authorized  home  loans  to  farmers.  Since  that  time, 
the  program  has  expanded  to  include  senior  citizens  and 
low- and  moderate-income  residents  in  towns  up  to  20,000. 
The  program  came  about  in  recognition  of  the  serious 
housing  conditions  and  the  shortage  of  mortgage  credit  in 
nonmetropolitan  America. 

For  purposes  of  this  discussion,  we  shall  concentrate  on 
title  502  housing  assistance  programs,  which  made  nearly 
$3  billion  in  loans  in  1978.  Under  this  title,  the  Farmers 
Home  Administration  is  authorized  by  Congress  to  make 
loans  for  the  purchase,  repair,  or  building  of  modest  low- 
to  moderate-income  homes.  This  program  provides  insured 
loans  to  families  with  an  adjusted  income  up  to  $15,000 
and  guaranteed  loans  to  families  with  an  adjusted  income 
up  to  $20,000.  Under  the  guarantee  program,  interest  rates 
are  negotiated  between  a  commercial  lender  and  the  bor- 
rower, and  90  percent  of  the  principal  and  interest  are 
guaranteed  by  HUD.  Under  the  insured-loan  program, 
families  may  obtain  an  interest  subsidy  that  would  bring 
their  interest  from  the  commercial  lender  down  to  1  percent; 
the  maximum  interest  on  these  loans  is  10  percent. 

Adjusted  income  is  obtained  by  taking  the  total  income  of 
all  adults  expected  to  live  in  the  house;  5  percent  of  this  total 
and  $300  for  each  child  expected  to  reside  in  the  house  are 
then  subtracted  to  get  the  adjusted  figure. 
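The adjusted-income rule lends itself to a short worked sketch. The function names and the sample household are hypothetical, and the reading that a household under the $15,000 ceiling also meets the $20,000 ceiling for guaranteed loans is an assumption drawn from the "up to" wording above.

```python
INSURED_LIMIT = 15_000       # adjusted-income ceiling for insured loans
GUARANTEED_LIMIT = 20_000    # adjusted-income ceiling for guaranteed loans


def adjusted_income(adult_incomes, children):
    """Total adult income less 5 percent of that total and $300 per child."""
    total = sum(adult_incomes)
    return total - 0.05 * total - 300 * children


def loan_programs(adjusted):
    """List the loan types whose income ceiling the household meets."""
    programs = []
    if adjusted <= INSURED_LIMIT:
        programs.append("insured")
    if adjusted <= GUARANTEED_LIMIT:
        programs.append("guaranteed")
    return programs


# Hypothetical household: two earners and two children.
income = adjusted_income([12_000, 5_000], children=2)
print(round(income), loan_programs(income))
```

For this invented household, $17,000 of total income less $850 and $600 leaves an adjusted income of $15,550, which is above the insured-loan ceiling but within the guaranteed-loan ceiling.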

Residents  in  all  nonmetropolitan  areas  with  populations 
less  than  10,000  are  eligible  for  this  assistance.  However, 
residents  in  areas  over  10,000  but  not  greater  than  20,000 
may  be  eligible  if  there  is  a  shortage  of  credit  either  from 
governmental  or  private  sources  as  certified  by  the  Secre- 
taries of  Housing  and  Urban  Development  and  of  Agriculture. 
In  all  these  areas,  loans  are  restricted  to  low-  and  moderate- 
income  families  as  defined  by  their  adjusted  income. 

Allocation  formula.  The  funds  are  allocated  by  the 
Federal  Government  to  States  according  to  a  formula  that  is 
very  sensitive  to  specific  population  segments.  The  formula 
for  insured  502  loans  uses  the  State's  percentage  of  the 
national  population,  State's  percentage  of  national  rural 
population  living  in  inadequate  housing,  State's  percentage 
of  national  rural  poverty,  and  cost  of  housing  adjusted  for 
population.  Each  variable  in  this  formula  has  a  30-percent 
weight  except  the  cost,  which  has  10  percent.  The  State  di- 
rectors, in  turn,  allocate  their  funds  among  State  districts 
(a  composite  of  State  counties),  using  the  same  formula. 
The  districts  then  allocate  the  funds  to  their  constituent 
counties,  using  the  same  formula.  The  counties  make  awards 
to  eligible  citizens  within  their  jurisdictions. 
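The weights and the State-to-district cascade described above can be sketched as follows. The factor names, the sample figures, and the assumption that every factor (including the housing-cost factor) enters as a unit's share of the total for its sibling units are illustrative only; the text does not spell out how the cost factor is computed.

```python
# Weights as stated in the text: 30 percent for each factor except cost.
WEIGHTS = {"population": 0.30, "inadequate_housing": 0.30,
           "rural_poverty": 0.30, "housing_cost": 0.10}


def allocate(pot, units):
    """Split `pot` among sibling units by their weighted factor shares."""
    totals = {k: sum(u[k] for u in units.values()) for k in WEIGHTS}
    return {
        name: pot * sum(w * u[k] / totals[k] for k, w in WEIGHTS.items())
        for name, u in units.items()
    }


# Hypothetical two-State "nation" (figures in thousands).
states = {
    "A": {"population": 600, "inadequate_housing": 50,
          "rural_poverty": 80, "housing_cost": 40},
    "B": {"population": 400, "inadequate_housing": 50,
          "rural_poverty": 120, "housing_cost": 60},
}
state_alloc = allocate(1_000_000, states)

# State A reuses the same formula for its two hypothetical districts.
districts_a = {
    "A1": {"population": 400, "inadequate_housing": 20,
           "rural_poverty": 60, "housing_cost": 25},
    "A2": {"population": 200, "inadequate_housing": 30,
           "rural_poverty": 20, "housing_cost": 15},
}
district_alloc = allocate(state_alloc["A"], districts_a)
print(state_alloc, district_alloc)
```

Because each factor's shares sum to one and the weights sum to one, every level of the cascade distributes exactly the amount it received from the level above.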

The guaranteed loans are distributed by the States, using a
formula that is slightly different from that used to distribute
insured loans. Instead of the poverty population, it
counts middle-income population, i.e., the State's percentage
of national rural households with incomes between $15,000
and $20,000.

Impact  of  the  undercount.  Even  though  this  formula  has 
a  heavy  population  content,  the  undercount  of  the  total 
population  has  little  to  do  with  fund  distribution.  Distribu- 
tion is  based  on  specific  segments  of  the  population  and, 
therefore,  to  suffer  a  loss,  a  jurisdiction  will  have  to  experi- 
ence an  undercount  in  that  segment.  Specifically,  the  seg- 
ments are  rural  population,  rural  population  living  in  in- 
adequate housing,  rural  population  below  the  poverty  level 
threshold,  rural  households  62  years  of  age  and  older,  and 
rural  households  with  incomes  between  $15,000  and 
$20,000.  An  undercount  affecting  these  specific  segments 
will  hurt,  but  we  know  little  about  the  accuracy  of  the 
counts  in  these  groups. 


It  should  be  noted  that  while  Federal  dollars  are  distri- 
buted to  States,  their  destinations  are  to  specific  classes 
of  individuals  in  specific  types  of  counties.  For  all  intents 
and  purposes,  then,  the  counties  or  the  towns  are  the 
jurisdictions  at  risk,  not  the  State. 

The  counties  at  risk  are  affected  by  the  State  rural  totals, 
since  these  determine  the  State  pool  from  which  the  counties 
draw.  A  higher  count  for  a  county  does  not  lead  initially  to 
a  redistribution  among  counties  in  its  district,  assuming  that 
the  correct  count  is  reflected  in  the  new  State  total.  But 
because  any  given  county  is  likely  to  be  a  very  small  part  of 
the  total  national  rural  or  even  State  rural  population,  ad- 
justing for  its  undercount  is  likely  to  lead  to  a  very  small 
redistribution  among  States.  The  added  dollars  could  be 
significant,  however,  in  terms  of  that  specific  county's 
allocation  or  needs. 

While  the  county  is  the  jurisdiction  at  risk,  it  is  not  the 
ultimate  loser.  Unlike  the  other  programs  described  in  this 
paper,  502  assistance  does  not  go  to  create  neighborhood  or 
community-wide  goods  or  services,  and  it  does  not  become 
part  of  the  general  funds  available  to  the  local  or  State  juris- 
dictions. The  beneficiaries  of  the  502  housing  program  are  a 
specific  class  of  eligible  residents  in  specific  types  of  counties 
or  towns.  Therefore,  the  loser  can  clearly  be  identified  as  a 
household  rather  than  a  jurisdiction. 

The  loss  to  such  a  household  in  a  specific  county  is 
limited  by  the  credit-carrying  capacity  of  that  household, 
since  the  amount  of  assistance  available  to  it  is  a  function 
of  that  factor.  The  loss  falls  upon  a  household  that  would 
otherwise  have  been  assisted  had  the  county  received  its  due 
share. 

EQUITY  CONSIDERATIONS 

The  argument  is  often  put  forward  that  a  correction  of 
one  jurisdiction's  count  will  lead  to  a  reallocation  of  funds, 
such  that  one  jurisdiction  can  only  be  made  better  off  at 
the  expense  of  another.  This  argument  is  sapped  of  its 
strength  when  we  consider  the  following: 

1.  At  the  outset  of  every  fiscal  year,  calculations  are  made 
anew.  At  that  point,  unless  protected  by  a  specific 
hold-harmless  clause,  the  specific  allocation  to  each 
jurisdiction  is  unknown  and  undetermined.  Thus,  there 
is  not  actually  a  taking  from  one  recipient  and  a  giving 
to  another. 

2.  Legislation  setting  up  an  assistance  program  carries  the 
express  intent  of  the  program,  and  population  is  con- 
sciously chosen  as  a  factor  or  a  proxy  to  represent  the 
problem  that  is  being  addressed.  Any  reallocation  that 
is  based  on  an  accurate  population  count  is  therefore 
an  improvement  over  one  that  is  based  on  an  inaccurate 
count. 

3.  To  the  extent  that  the  assumption  of  a  diminishing 
marginal  utility  of  a  dollar  is  correct,  a  reallocation  of  a 
dollar  from  a  richer  to  a  poorer  recipient  represents  a 
movement  toward  a  general  social  improvement  or 
Pareto  optimality.  Consequently,  a  reallocation  that 
increases  the  dollar  received  by  a  poorer  recipient— the 
general  direction  of  intergovernmental  aid— even  by 
denying  that  dollar  to  richer  recipients  is  an  improve- 
ment in  social  welfare.  The  rich  need  a  dollar  less  than 
the poor.

4.  In most cases, the dollar requirement to adjust for the
undercount of any recipient is spread over a large number
of other recipients; for each of these other recipients,
the dollar adjustment is likely to be small and
lead to a relatively insignificant change in its allocation.
To  the  losing  jurisdiction  or  household,  the  loss  could 
be  substantial  relative  to  its  need. 

Politics,  Economics,  and  Misconceptions 

There  are  several  considerations  that  reduce  the  vigor  with 
which  the  case  for  an  adjustment  is  pursued.  They  include 
politics,  economics,  and  misconceptions. 

First,  those  who  lose  because  of  an  undercount  are  often 
unaware  of  their  losses.  This  is  so  because  the  undercount 
is  an  intransitive  tax.  That  is,  we  do  not  literally  take  from 
a  recipient.  We  do  not  announce  how  much  each  should 
have  gotten.  Therefore,  persons  and  jurisdictions  alike  might 
not  fully  internalize  the  cost  to  them  of  an  undercount  and 
won't  pursue  adjustments. 

Second,  for  some,  the  cost  of  an  adjustment  is  higher  than 
the  perceived  benefits.  For  some  jurisdictions,  to  prove  and 
pursue  legal  and  administrative  actions  to  correct  an  under- 
count is  not  only  costly  but  risky,  since  there  is  no  assurance 
that  their  estimates  will  be  accepted  or,  if  they  are,  that  they 
will  work  consistently  or  sizably  to  the  jurisdiction's  ad- 
vantage. 

Third,  sometimes  it  is  not  to  the  political  or  economic 
advantage  of  a  larger  jurisdiction  (State  or  county)  of  which 
an  undercounted  jurisdiction  is  a  part  to  help  pursue  an 
adjustment. In the case of revenue sharing, an adjusted count
for a single sub-State jurisdiction could lead more to an
intrastate redistribution than to an interstate redistribution;
that is, the chances that jurisdictions within a State will
band together on adjusting the count of a single jurisdiction
are diminished.

Fourth,  where  the  losers  are  a  class  of  persons  or  juris- 
dictions, rather  than  an  identifiable  entity,  the  incentive  to 
pursue  an  adjustment  is  lessened,  since  no  single  member  of 
that  class  has  any  assurance  that  if  an  adjustment  is  made  it 
will  benefit.  This  is  the  case  of  the  CDBG  metropolitan  or 
nonmetropolitan  discretionary  funds,  where  State  alloca- 
tions are  based  on  population  but  intrastate  distribution  is 
based on competition among applicants. It is also true of
the Farmers Home Administration Housing 502 program.

Fifth, the equity question is frequently clouded by assuming
that it is rich versus poor. Indeed, this vertical equity
is often the case. When it is, the welfare argument for a
readjustment  is  clear-cut  and  illustrated  by  the  numerous 
accounts  of  less  needy  areas  using  allocations  for  activities 
which  are  not  basic  and  are  questionable  parts  of  the  overall 
objective  of  the  legislation.  This  is  exemplified  by  the  argu- 
ments that  surround  many  government  programs.  There  is 
also  the  issue  of  horizontal  equity:  To  the  extent  that 
population  reflects  need,  an  undercount  means  that  two 
jurisdictions  that  are  fundamentally  similar  in  size  (and  to 
that  extent  in  need)  are  treated  differently  because  of  the 
undercount. 

Finally,  the  argument  is  frequently  made  that  there  is  an 
equity  issue  involving  correcting  only  the  population  variable 
when  other  variables  in  a  formula  also  have  errors.  It  appears 
that  an  incremental  step  toward  equity,  i.e.,  correcting  only 
the  population  data,  is  superior  to  standing  still. 

CONCLUSIONS 

This  paper  discusses  population  as  a  key  variable  in  the 
distribution  of  Federal  aid  to  State  and  local  governments 
and  in  the  distribution  of  State  aid  to  their  localities.  How 
the  undercount  affects  this  distribution  process  can  only  be 
ascertained  on  a  program-by-program  basis,  since  the  role 
that  population  plays  varies. 

Similarly,  the  impact  of  the  undercount  on  any  jurisdic- 
tion requires  a  detailed  analysis  of  the  mix  of  Federal  and 
State  assistance  that  the  jurisdiction  receives.  The  story  is 
told  that  the  city  of  Portland,  Oreg.,  having  discovered  that 
its  undercount  led  to  a  loss  in  general  revenue  sharing  funds, 
sought  to  correct  this,  only  to  realize  that  the  higher  popu- 
lation figure  would  have  severely  reduced  its  CDBG  entitle- 
ment, using  the  population-lag  formula.  To  have  used  a 
corrected  count  would  have  led  to  a  net  loss. 

In  this  vein,  it  has  been  shown  that  while  an  undercount 
will  reduce  the  entitlement  of  a  city  in  most  cases,  in  the 
particular  case  of  the  growth-lag  variable  used  in  the  CDBG 
program,  it  could  yield  a  higher  entitlement  than  the  city 
might  otherwise  have  received.  An  undercount  turns  out  to 
be  potentially  beneficial. 

Surely,  there  is  an  equity  question  associated  with  the 
redistribution  that  would  occur  as  a  result  of  a  correction 
in  the  count.  On  balance,  it  appears  to  this  author  that  the 
equity  argument  works  in  favor  of  making  the  adjustment. 
The  argument  is  simple:  A  distribution  based  on  an  error 
which  causes  us  to  move  away  from  the  original  intent 
takes  us  away  from  the  equity  state  originally  postulated 
as desirable. Any correction, therefore, brings us closer to
the desired state and is an improvement in both welfare and
equity.

In  considering  the  equity  question,  an  error  is  often  made 
in  assuming  that  (owing  to  an  adjustment  of  the  count)  funds 
will  be  taken  from  one  jurisdiction  and  given  to  another. 
Actually,  few  localities  have  prior  claims  on  a  specific  dollar 
amount of a program appropriation. They are normally
informed of their legal claims after the allocation procedure
is  conducted  in  a  central  State  or  Federal  bureaucracy. 
Dollars  are  distributed  in  the  computation  process  only. 
Further,  it  is  illogical  to  conclude  that  one  has  lost  that 
which  one  should  never  have  received. 

Concerning  the  dollar  amount,  the  few  studies  that  have 
been  done  indicate  that  the  dollar  gain  or  loss  is  generally 
below  5  percent.  It  might  be  legitimate  to  consider  that  this 
quantity  is  not  worth  fighting  for  or  to  conclude  that  it  is 
too  costly  to  try  to  achieve.  But  these  are  economic— not 
equity— considerations.  The  equity  consideration  would  ap- 
propriately relate  to  the  utility  (not  the  amount)  of  the 
dollars  in  a  needy  city  compared  to  the  utility  in  a  city 
which  is  not  needy  and  is  obtaining  more  than  was  intended. 
Furthermore,  a  5-percent  loss  over  several  years  of  a  major 
program  is  by  no  means  peanuts. 

This  paper  has  also  tried  to  identify  losers  and  gainers 
resulting  from  an  undercount  and  its  correction.  Sometimes 
these  might  be  persons  or  jurisdictions.  On  other  occasions 
the  losers  might  be  a  class  of  persons  or  a  class  of  juris- 
dictions. Because  these  programs  have  effects  that  go  beyond 
the  individual  beneficiary,  the  consequence  of  an  undercount 
is  shared  by  a  wide  public. 

Hence,  while  we  have  clearly  identified  specific  groups 
or  entities  as  losers,  it  should  be  noted  that  since  most 
government  programs  produce  externalities— that  is,  the 
benefits  are  enjoyed  by  persons  or  entities  other  than  the 
recipients— the  ultimate  losers  are  frequently  more  than  the 
single  person  or  entity.  The  Farmers  Home  Administration 
Program  is  aimed  essentially  at  a  low-  or  moderate-income 
household,  but  to  the  extent  that  those  funds  are  used  to 
substantially  improve  a  house,  they  benefit  the  neighborhood 
in  which  the  house  exists;  to  the  extent  that  they  help  in 
providing  credit  to  a  small  town,  they  give  liquidity  to  the 
housing market in that town. Similarly, CDBG assistance,
which increasingly benefits low- and moderate-income
families (accounting for roughly 70 percent of beneficiaries),
leads to improvements in neighborhoods, which benefit the
people living in them and, indirectly, the city as a whole.

Finally,  while  this  paper  has  concentrated  on  population 
as  a  formula  or  eligibility  factor,  it  is  obvious  that  there  is  a 
relationship  between  dollars  and  votes  at  all  levels  of  govern- 
ment. 

APPENDIX 

EFFECTS  OF  UNDERCOUNT  ON 
GROWTH-LAG  VARIABLE 

Turn to the diagram on the following page. Let point (a)
represent the correct population in the base year. An
undercount means that the population reported by the
census is below (a); the reported figure is represented by
point (b). If, in the current year, population is correctly
enumerated for a growing city, this means that the true
current population size (c) is above (a). The distance from
(b) to (c) is greater than the distance from (a) to (c), and
the difference represents the overestimate of growth.

Graphically, if the city is declining, then the current
population (c) is below point (a). If the decline is such that the
current population is above the point of the undercount (b),
then it can be represented by (c1). If the population decline
was large and below the point of the undercount, it could be
represented by point (c2). If the current population level is
(c1), then a growth is reported, since (c1) > (b). But in
reality a decline was experienced, since (c1) < (a). If the
current population is (c2), then a decline is reported, but
one that is substantially lower than what actually occurred,
since the distance from (a) to (c2) > (b) to (c2).

The  true  population  in  both  years  is  equal  if  the  city  is 
stable  and  (a)  =  (c).  Since  there  was  an  undercount  in  the 
base  year,  represented  by  (b),  a  growth  is  reported  for  the 
city  even  though  it  is  stable.  The  growth  is  represented  by 
the  distance  from  (b)  to  (c). 

If a city had its population in the base year accurately
reported, this population level could be represented by point
(a). If the city has grown, its true current population would
be some level above (a) and could be represented by point
(c). An undercount of the current population which is above
the base year could be represented by point (b1). In that
case, the city is helped, since its true growth has been
underestimated. That is, the distance from (a) to (c) > (a) to
(b1). If, however, the undercount was so severe that it
reported a population figure (b2), the city would be helped
even more, since the city would be reported as having
declined, (b2) < (a).

If a city is declining and its population in the base period
was accurately reported as (a), its current population will be
some point (c), which is below (a). If, however, the current
population is undercounted, then, by definition, the
undercount (b) is below (c) and the city will be helped, since its
decline would have been exaggerated; that is, a decline from
(a) to (b) > (a) to (c).

If the population of the city is stable and the base period
count is accurate, then (a) = (c). An undercount means that a
population below (a) will be reported at some level such as
(b). Hence, the city will be helped because it will be shown
as having declined even though it has not. Note that (a) =
(c) > (b).

The situation of a jurisdiction that has been undercounted
in both periods is a little more complex. The examples shown
here indicate that the relative size of the undercount and
whether the city is growing, declining, or stable are important.

Let us take a jurisdiction that was undercounted in both
periods but grew between periods. Its accurate count can be
designated by point (a). An undercount in the base period
can be represented by some point (b). If it grew between
periods, this growth can be represented by point (c). An
undercount in the second period can be represented by
point (d1), (d2), (d3), (d4), or (d5). In the case of (d1),
an undercount which is very slight and close to the true
current population, the jurisdiction will be hurt, since its
growth will be exaggerated: the distance from (b) to (d1) >
(a) to (c).

[Diagram: Effects of Undercount. A. Base Year Count Underreported, Current Count Correct]

If the undercount is represented by (d2), the census will
register a growth, (b) to (d2). In this case, the jurisdiction
will be hurt if the distance from (b) to (d2) > (a) to (c). If
(b) to (d2) < (a) to (c), growth is underestimated and the
jurisdiction is helped. If (b) to (d2) = (a) to (c), the
undercount has no effect.

If the undercount is represented by (d3), the jurisdiction's
growth may be underestimated and it is helped if, and only
if, (b) to (d3) < (a) to (c). If the undercount is represented
by (d4), it would register the jurisdiction as having received
zero growth and assist it. If the undercount is (d5), then
the jurisdiction is helped because it is registered as having
declined rather than grown.

Finally, we can look at a jurisdiction that has declined
in population between the first and second periods and has
experienced an undercount in each period. Assume that
(a) is the correct count in the first period and (b) is the
undercount in the first period. If the population is declining,
it could be represented by (c1), (c2), or (c3). An undercount
can be represented by (d1), (d2), (d3), or (d4). If the correct
population size in the second period is (c1), and the
undercount in that period is (d1), the jurisdiction will be hurt,
since the census will register it as growing rather than
declining. If the undercount in the second period is (d2), the
census will register a decline, i.e., (b) to (d2). The jurisdiction
is helped if (a) to (c1) < (b) to (d2). The effect is zero if (a)
to (c1) = (b) to (d2). It is hurt if (a) to (c1) > (b) to (d2). It
is also hurt if the undercount is represented by (d3),
because this will represent zero growth rather than a decline.

Suppose, however, that the correct count in the second
period was represented by (c2) and the undercount in this
period by (d4). In this case, the jurisdiction is helped if
(a) to (c2) < (b) to (d4). The undercount will have no effect
if (a) to (c2) = (b) to (d4). It will be injured if (a) to (c2) >
(b) to (d4).

In the case of a stable jurisdiction that has not grown
between the two periods, its true population in both periods
will be the same, i.e., (a) = (c). If the undercount is
represented by (b) and the true population by (a) in the first
period, then an undercount in the second period could be
represented either by (d1) or (d2). In the case of (d1), the
jurisdiction is hurt because a growth rather than a stable
population is registered by the census. In the case of (d2),
a decline is registered and the jurisdiction is helped.

Summary 

These results indicate that if a city has an undercount in
the base year and a correct count in the current year, it will
invariably be injured by a population growth-lag variable;
if it had a correct count in the base period and an undercount
in the current period it will be helped. These results hold
whether  cities  are  growing,  declining,  or  stable.  If  there  are 
undercounts  in  both  periods,  the  results  are  uncertain.  A 
city  could  gain,  lose,  or  experience  no  impact  whatsoever 
regardless  of  whether  the  city  is  growing,  declining,  or 
stable.  In  general,  a  very  large  undercount  would  be  required 
in  the  current  period  if  a  jurisdiction  which  was  under- 
counted  in  the  first  period  is  to  be  helped.  Such  a  large 
undercount  would  certainly  hurt  in  every  other  program, 
such  as  CETA  and  revenue  sharing. 
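
The cases in this appendix can be summarized in a short sketch. It assumes, as the diagram discussion does, that growth is measured as the simple difference between the current and base counts, and that a jurisdiction is "hurt" under a growth-lag variable when its reported growth exceeds its true growth. The function name, argument names, and sample figures are illustrative and do not come from the paper.

```python
def growth_lag_effect(true_base, reported_base, true_current, reported_current):
    """Classify the effect of undercounts on a growth-lag variable.

    Assumption: growth is the difference between current and base counts,
    following the "distance" language of the appendix. A jurisdiction is
    hurt when reported growth exceeds true growth (its lag is understated)
    and helped when reported growth falls short of true growth.
    """
    true_growth = true_current - true_base              # (a) to (c)
    reported_growth = reported_current - reported_base  # (b) to (d)
    if reported_growth > true_growth:
        return "hurt"
    if reported_growth < true_growth:
        return "helped"
    return "no effect"


# Hypothetical counts, in thousands.
print(growth_lag_effect(100, 95, 110, 110))   # base undercounted, current correct
print(growth_lag_effect(100, 100, 110, 104))  # base correct, current undercounted
print(growth_lag_effect(100, 95, 110, 105))   # equal undercounts in both periods
```

Because the comparison uses signed differences, the same function covers growing, declining, and stable jurisdictions; a decline is simply a negative growth figure.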
Comments


Wray  Smith 

U.S.  Department  of  Health,  Education,  and  Welfare 


While  the  Slater  paper  indicated  that  the  effect  of  the 
undercount  can  and  should  be  assessed,  at  least  for  some 
programs,  it  perhaps  should  be  of  secondary  concern.  Some 
of  the  basic  decisions  to  be  faced  by  the  Census  Bureau  will 
have  to  be  made  without  the  results  of  the  studies  suggested, 
and  those  studies  are  likely  not  to  be  at  all  conclusive.  Thus, 
some  decisions  must  be  made  about  adjustment,  taking  into 
account  either  known  effects  or  those  that  can  be  reasonably 
speculated about, but there will not be definitive information
to show that the decision made was exactly the right
one.

It is important to make a statistical decision on adjustment
soon and as cleanly as possible, using what is already
known and whatever collective views can be used to
construct a more informed decision. The decision also should
be based on a concern for the gainers and losers; the designs
for new programs revolve around cost and coverage, that is, who
gains and who loses. Program administrators, program planners,
and the Congress will be inventive enough to modify
formulas based on the adjustment.

Although  the  Bryce  paper  would  have  been  strengthened 
by  the  use  of  some  live  data,  the  paper  gives  a  very  balanced 
view  of  an  adjustment  process— in  any  process  there  will  be 
gainers  and  losers— and  the  determination  of  that  is  a  very 
complex  process.  The  point  of  the  boundaries  on  the  effects 
of  an  adjustment,  that  is,  the  cushioning  effect  on  any 
adjustment  built  into  a  formula  with  upper  and  lower 
constraints,  is  very  important.  In  those  formula  programs 
that  do  not  have  cushioning  factors  built  in,  one  might  wish 
to  build  in  transition  arrangements  to  cushion  the  impact  of 
an adjustment on a major allocation program. Sensitivity to
the undercount varies between different formula programs
and must be analyzed in any adjustment process.

In deciding whether to support an adjustment, communities
are in a risk-averse situation. That is, they would be
more  likely  to  favor  adjustment,  if  it  could  be  done  without 
exposing  themselves  to  the  jeopardy  of  losing  because  of  an 
adjustment.  All  things  considered,  however,  there  definitely 
should  be  an  adjustment,  except  in  those  cases  where  the 
cost  of  a  very  small  adjustment  is  too  great. 

The  effect  of  the  compounding  over  time  of  an  error  on 
the  allocated  amount  also  is  an  important  consideration.  Mr. 
Bryce  said  that  a  reality  in  adjustment  is  that  what  might 
appear  to  be  a  small  cost  to  a  community  of  an  undercount, 
would  become  significant  over  time.  In  an  entitlement 
process,  what  is  distributed  is  really  a  notification  of  an 
entitlement,  not  the  actual  money,  and  the  community  may 
have  to  make  application  to  get  the  money  it  is  entitled  to. 
The  community  may  not  realize  what  it  gains  or  loses  until  it 
is  explained. 

There  are  three  distinguishable  types  of  losers  and  gainers. 
The  first  is  in  the  case  of  an  identifiable  individual  person 
who  loses  or  gains,  for  example,  in  the  Farmers  Home 
Administration 502 program. In the general revenue sharing
program, the losers or gainers are identifiable jurisdictions.
In the Community Development Block Grant
Program,  which  has  a  discretionary  fund,  there  are  no 
identifiable  losers  or  gainers  because  of  the  way  the  program 
is  set  up.  In  talking  about  losers,  it  may  not  be  possible  to 
identify  individuals,  and  the  applicant  could  not  know 
whether  it  would  actually  win  if  its  figures  were  adjusted  for 
undercount. 
Floor Discussion


It  was  observed  that  although  most  discussion  focused  on 
the  general  revenue  sharing  program  and  the  discussion  was 
in  terms  of  a  zero  sum,  other  programs  are  not  so 
competitive  because  they  are  not  zero  sum.  In  fact,  it  was 
suggested  that  the  fixed  pie  in  revenue  sharing  should  not 
necessarily  be  fixed.  If  a  large  number  of  persons  were  added 
to  the  national  total,  perhaps  the  total  amount  of  money 
allocated  should  be  adjusted.  It  is  also  not  inconceivable  that 
the  formula  would  be  changed  after  an  adjustment  was  made. 

Discussion  followed  concerning  what  action  was  likely  if 
Congress  attempted  to  take  into  account  the  impact  of  the 
undercount.  The  question  of  a  transitional  fund  arose,  as  well 
as the question of holding communities harmless. Congressional
debate might consider an adjustment, as long as
those  communities  that  would  otherwise  have  received  a 
certain  number  of  dollars  would  be  held  harmless.  This  was 
felt  to  be  a  likely  outcome,  since  it  reduces  the  amount  of 
acrimony  and  would  in  the  long  run  reduce  the  amount  of 
adjustment  made.  The  problem  is  that  the  debate  is  not  only 
about  funds,  but  also  about  congressional  districts  and  power. 

It was agreed that it is virtually impossible for a
jurisdiction to compute the differential impact of an adjustment
in the many formula programs. The formulas are far
too complex and there are insufficient data available to make
an estimate of impact. Under the Community Development Block
Grant Program, for example, the city is one of the communities
benefited by the growth-lag formula. At the same time,
however, a number of new standard metropolitan statistical
areas had to share in the same total amount, and the real
impact of the change is unknown. Similarly, if all of the
elements  of  a  formula  were  simultaneously  updated,  there 
would  be  no  means  to  compute  the  effect  of  each  factor.  It 
was  concluded  that  the  elements  in  formulas  and  the 
regulations  concerning  them  are  far  too  complicated;  they 
should  be  simplified  and  made  more  uniform  in  their  impact 
on  cities.  It  was  noted,  however,  that  the  Office  of  Federal 
Statistical  Policy  and  Standards  has  added  a  staff  member 
whose  only  responsibility  will  be  to  examine  Federal 
allocation  formulas.  This  person  will  work  with  the  Congress 
and  in  coordinating  agency  work  on  formulas. 

Finally, concern was expressed as to the possible
compounding effect of adjustments. For example, if there
were a 5-percent error that was not constrained by the
boundaries and that compounded each year, it would double
within about 14 years and, within 20 years, increase by about
165 percent. But that assumes a linear
relationship (with no feedback from lost funds) on attributes
of the population prone to undercount, which might
further increase the error.
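
The arithmetic behind these figures can be checked with a short sketch. It assumes the 5-percent error compounds once a year with no boundary constraints, which is the reading under which the 14-year and 20-year figures come out as stated; the function name is illustrative.

```python
def compounded_error(rate, years):
    """Return the growth factor of an error compounding annually at `rate`."""
    return (1 + rate) ** years


factor_14 = compounded_error(0.05, 14)  # close to 2, i.e., nearly doubled
factor_20 = compounded_error(0.05, 20)  # roughly 2.65 times the original
increase_20 = (factor_20 - 1) * 100     # percentage increase after 20 years

print(f"After 14 years: {factor_14:.2f} times the original error")
print(f"After 20 years: an increase of about {increase_20:.0f} percent")
```

Under this reading, the 165 percent figure is the increase over the original error, so the error itself would stand at roughly 2.65 times its starting size.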
INTRODUCTION

The issue of an undercount has plagued every decennial
census since its inception. In fact, for the first census, the
Government mandated that minorities be undercounted! In
that census of 1790, only three of every five slaves were to be
counted for purposes of political apportionment.
Thus, from the outset, 40 percent of the black population
was left out of the census.1 Since then, blacks have
continued to be disproportionately undercounted in decennial
censuses, although not always as a result of overt
governmental mandate.

In the census of 1870, for example, the Census Bureau
estimated that blacks accounted for 41 percent (or 512,163)
of the total 1,260,078 persons not counted. Put differently,
while 1.9 percent of the white population was missed in 1870, 9.5
percent of the black population was left out [15]. Apparently,
some of the highest undercount rates for blacks
occurred in the censuses conducted between 1920 and 1940.
While 12.5 percent and 12.7 percent of blacks were not
counted in the 1930 and 1940 censuses, respectively, 15.2
percent of all blacks were estimated to have been missed in
the census of 1920. But by 1960 and 1970, the Census
Bureau estimated that the black undercount rate fell to 8.0
percent and 7.7 percent, respectively [15].

Although  the  magnitude  of  the  undercount  for  other 
minorities,  such  as  Hispanics  and  Native  Americans,  has  not 
yet  been  systematically  specified,  the  evidence  available 
strongly  suggests  that  they  have  been  disproportionately 
undercounted  as  well.  But  the  issue  of  the  census  undercount 
is  not  merely  of  academic  or  technical  importance;  it  has 
serious  social,  economic,  political,  and  legal  implications  as 
well  [1]. 

First  of  all,  a  disproportionate  undercount  of  subgroups 
severely  flaws  the  accuracy  and  adequacy  of  social  planning 
efforts,  especially  needs  assessments,  population  projections, 
and  the  distribution  of  social  and  economic  programs  and 
services to those groups [4].

Secondly, since population figures are used, in part, by
over 100 Federal programs to allocate billions of dollars each
year in such areas as education (title I, free lunches),
employment (Comprehensive Employment and Training
Act), housing (Community Development Block Grants),
economic development (Economic Development Administration),
social services (title XX, child abuse, elderly), crime
(Law Enforcement Assistance Administration), health,
transportation, and revenue sharing, States and localities with
disproportionate undercounts are deprived of their equitable
share of these Federal grants-in-aid [17].


1 While 40 percent of black slaves were omitted from the
population counts used for allocating seats for each State in the
House of Representatives, all of them were included in the total
population counts in the censuses conducted between 1790 and 1860.
Thus, two census counts were maintained: a total population count
and an "adjusted" count for purposes of political apportionment. See
Hyman Alterman, Counting People: The Census in History, New
York: Harcourt, Brace, and World, 1969, especially chapter 7, "The
Negro in the American Census," pp. 262-290.

Third, since the census count is used as the basis for
allocating representation, not only in the House of Representatives,
but in State and local policymaking bodies as well,
States and local areas with disproportionate undercounts are
also deprived of their equitable share of political representation
at all levels of government [16].

Fourth,  the  deprivation  of  an  equitable  share  of  political 
representation  and  financial  aid  to  States  and  localities,  as 
well  as  to  subgroups,  with  disproportionate  undercounts 
raises  significant  constitutional  and  legal  questions  as  well.  In 
fact,  one  of  the  resolutions  adopted  at  a  1967  conference  on 
the  census  undercount  (sponsored  by  the  M.I.T.-Harvard 
University  Joint  Center  for  Urban  Studies)  effectively  sets 
forth  the  constitutional  issue: 


We  believe  that  what,  initially  at  least,  were  technical 
problems  have  by  their  very  magnitude  been  transformed 
into  social  problems  with  powerful  legal  and  ethical 
implications.  Specifically,  we  hold  that  where  a  group 
defined  by  racial  or  ethnic  terms  and  concentrated  in 
specific,  political  jurisdictions,  is  significantly  under- 
counted  in  relation  to  other  groups,  then  individual 
members  of  that  group  are  thereby  deprived  of  the 
constitutional  right  to  equal  representation  in  the  House 
of  Representatives  and,  by  inference,  in  other  legislative 
bodies.  Further,  we  hold  that  individual  members  of  such 
a  group  are  thereby  deprived  of  their  right  to  equal 
protection  of  the  laws  as  provided  by  Section  1  of  the 
14th  Amendment  to  the  Constitution  in  that  they  are 
deprived  of  their  entitlement  to  partake  in  Federal  and 
other  programs  designed  for  areas  and  populations  with 
their  characteristics. 

Injury,  while  general,  is  real;  redress  is  in  order.  This 
would  seem  a  matter  of  special  concern  to  the  Nation  in 
view  of  recent  Supreme  Court  rulings  establishing  the 
"one-man-one-vote"  principle  in  apportioning  legislatures 
and  in  view  of  the  extensive  Congressional  activity  in  the 
establishment  of  programs  designed  to  improve  the 
economic  and  social  status  of  just  those  groups  that 
appear  to  be  substantially  underrepresented  in  our  current 
population statistics [1].
Fifth, and finally, a disproportionate undercount of
minorities hurts nonminorities as well. Since areas with
disproportionate undercounts of minority groups are
deprived of their equitable share of Federal financial assistance,
those States and localities often have to place greater
tax burdens on all residents.

Thus,  the  census  undercount  has  significant  social, 
economic,  political,  and  legal  ramifications.  And  the 
inequities  to  local  areas  as  well  as  to  population  groups  with 
disproportionate  undercounts  require  immediate  redress.  But 
what  actions  can  be  taken  to  reduce  the  inequities  to  areas 
and  groups  as  a  result  of  the  census  undercount? 

There  is  widespread  agreement  that  some  adjustment  of 
the  population  figures  for  States  and  local  areas  to  correct 
for  the  census  undercount  is  desirable.  But  there  is  little 
consensus  regarding  such  related  issues  as: 

(a)  What  methods  can  be  used  to  correct  for  the  census 
undercount  for  States  and  local  areas— the  synthetic, 
demographic,  or  matching  method? 

(b)  Which  method  is  most  feasible  and  reliable  for 
adjusting  for  the  undercount  for  localities? 

(c)  Should  adjusted  population  figures  be  used  for 
purposes  of  political  apportionment  as  well  as  for 
financial  allocations  to  States  and  localities? 

This  paper  will  attempt  to  address  these  questions  by 
assessing  the  comparative  strengths  and  weaknesses  of  the 
synthetic  method  for  adjusting  the  census  undercount  for 
States  and  local  areas. 

The  second  section  of  this  paper  will  briefly  describe  the 
synthetic  method  and  its  basic  assumptions,  while  the  third 
section  will  provide  an  overview  of  research  studies  that  have 
used  the  synthetic  method.  In  the  fourth  section,  the 
comparative  advantages  and  disadvantages  of  the  synthetic 
method  will  be  assessed  according  to  various  criteria:  Internal 
consistency,  simplicity,  timeliness,  flexibility,  equity,  and 
reliability.  The  concluding  section  will  propose  specific 
recommendations  for  using  the  synthetic  method  to  adjust 
for  the  census  undercount  for  States  and  local  areas. 

THE  SYNTHETIC  METHOD 

The  synthetic  method  is  a  statistical  procedure  for 
distributing  the  undercount  of  a  larger  geographical  area 
(such  as  the  Nation,  State,  county,  or  city)  among  its 
subunits  (such  as  a  State,  county,  city,  or  congressional 
district,  respectively).  For  example,  this  method  permits  one 
to  distribute  the  total  5.3  million  persons  that  the  Census 
Bureau  estimated  had  been  left  out  of  the  1970  census  not 
only  among  all  50  States,  but  also  among  every  subdivision 
(such  as  standard  metropolitan  statistical  areas,  counties, 
cities,  towns,  congressional  districts,  wards,  neighborhoods, 
census tracts, and planning areas) within each State. Similarly,
the synthetic method allows one to distribute the 1.9 million
blacks and 3.4 million whites that were not included in the
official 1970 census count throughout all geographical
subdivisions below the national level.

The  Null  Hypothesis 

The synthetic method requires only one basic assumption:
the null hypothesis (i.e., the assumption of "no" difference).
The null hypothesis is a time-honored and widely accepted
practice in statistics which assumes that the means
estimated for samples (or subunits) are not
statistically different from the mean estimated for the
universe (or the total population). With regard to the
undercount, the null hypothesis assumes that the estimates
of the undercount for specific race/sex/age groups in various
subunits below the national level (such as States, counties,
and cities) are not statistically different from the undercount
estimates for the same race/sex/age groups at the national
level.2

For  example,  the  Census  Bureau  estimated  that  the 
national  undercount  rates  for  black  and  white  males  35  to  39 
years  old  were  17.8  percent  and  4.1  percent,  respectively. 
Based  on  the  null  hypothesis,  the  synthetic  method  assumes 
that  the  undercount  rates  for  black  and  white  males  35  to  39 
years  old  at  units  below  the  national  level  will  not  be 
statistically  different  from  their  undercount  rates  at  the 
national  level.  More  specifically,  the  synthetic  method 
assumes  that  white  males  ages  35  to  39  will  be  undercounted 
at  a  rate  of  4.1  percent  in  all  subnational  units,  while  black 
males  ages  35  to  39  will  be  undercounted  at  a  rate  of  17.8 
percent in all subnational localities (see table 1).

It  is  important  to  note  that  the  null  hypothesis  is  most 
often  used  by  statisticians  in  situations  where  one  does  not 
have  a  reliable  basis  for  inferring  the  magnitude  or  direction 
of  differences  between  the  means  of  samples  or  subunits.  In 
this  instance,  we  have  no  basis  for  knowing  whether,  for 
example,  black  males  ages  35  to  39  are  undercounted  in 
Birmingham,  Ala.,  or  Detroit,  Mich.,  at  a  rate  higher  or  lower 
than  the  17.8  percent  rate  for  black  men  in  this  age  category 
at  the  national  level.  We  are  constrained,  therefore,  to  accept 
the  null  hypothesis:  The  undercount  rates  for  black  males 
ages  35  to  39  in  Birmingham  and  in  Detroit  are  not 
statistically  different  from  the  undercount  rate  for  black  men 
in  that  age  group  at  the  national  level. 
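
The assumption described above can be illustrated with a minimal sketch that applies the two national rates quoted in the text to the published counts of a locality. The locality counts are hypothetical, the function and variable names are illustrative, and the rate is treated as the share of the true population that was missed, which is an assumption consistent with the worked figures given later in the paper.

```python
# National net undercount rates (percent) quoted in the text for
# males ages 35 to 39 in the 1970 census.
NATIONAL_RATES = {"white_male_35_39": 4.1, "black_male_35_39": 17.8}


def synthetic_estimate(counted):
    """Apply national group rates to a locality's published counts.

    `counted` maps each group to its published census count. The rate is
    taken as the share of the true population that was missed, so the
    estimated true count is counted / (1 - rate).
    Returns (estimated persons missed, overall undercount rate in percent).
    """
    missed = 0.0
    for group, count in counted.items():
        rate = NATIONAL_RATES[group] / 100
        missed += count / (1 - rate) - count
    total_true = sum(counted.values()) + missed
    return missed, 100 * missed / total_true


# Hypothetical localities with the same counted total but different mixes.
locality_a = {"white_male_35_39": 9000, "black_male_35_39": 1000}
locality_b = {"white_male_35_39": 5000, "black_male_35_39": 5000}

for name, counts in (("A", locality_a), ("B", locality_b)):
    missed, rate = synthetic_estimate(counts)
    print(f"Locality {name}: about {missed:.0f} missed, overall rate {rate:.1f} percent")
```

Although the same group rates are applied everywhere, the overall rate differs between the two hypothetical localities because their compositions differ.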

Another  way  to  make  clear  the  basic  assumptions  of  the 
synthetic  method  is  to  be  explicit  about  what  this  method 
does  not  assume: 

(a)   First,   the  synthetic  method  does  not  assume  that  all 


2 See  description  of  null  hypothesis  in  any  standard  text  on 
sampling  statistics.  However,  since  a  null  hypothesis  traditionally 
refers  to  hypotheses  that  can  be  independently  tested.  Professor 
Harry  Hoberts,  University  of  Chicago,  suggests  that  it  would  be  more 
appropriate  to  refer  to  the  assumptions  underlying  our  synthetic 
method  as  a  maintained  "hypothesis." 
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Table  1.  Estimates  of  Net  Undercount  Rates,  by  Race, 
Sex,  and  Age:  1970 


Table  2.  Estimates  of  Percentage  Increase  Necessary  to 
Adjust  for  Undercount,  by  Race,  Sex,  and  Age:    1970 


(Percent) 

Age 

Ma 

es 

Females 

Males 

:emales 

White 

Black1 

White 

Black1 

Age 

White 

Black1 

White 

Black1 

Under  5 
5-9 
10-14 
15-19 

2.4 
2.5 
1.1 
1.3 

11.5 

ai 

3.6 
3.2 

2.0 
2.2 
0.9 
0.5 

10.7 
7.0 

Total 

2.4 

8.9 

1.4 

4.9 

2.8 
2.1 

Under  5 
5-9 
10-14 
15-19 

2.3 
2.4 
1.1 
1.3 

10.3 
7.5 
3.5 
3.1 

2.0 
2.2 
0.9 
0.5 

9.7 
6.6 
2.7 
2.1 

20-24 
25-29 
30-34 
35-39 

2.6 
4.9 
4.2 
4.3 

9.5 
18.5 
16.8 
21.7 

1.1 
2.9 
2.0 
0.8 

3.5 
7.1 
3.8 
4.8 

20-24 
25-29 
30-34 
35-39 

2.5 
4.7 
4.0 
4.1 

8.7 
15.6 
14.4 
17.8 

1.1 
2.8 
2.0 
0.8 

3.4 
6.6 
3.7 
4.6 

40-44 
45-49 
50-54 
55-59 

3.3 
3.6 
1.8 
2.1 

19.5 
15.3 
11.2 
11.9 

0.1 

0.5 

-0.3 

1.3 

3.F 
5.o 
4.0 
8.0 

40-44 
45-49 
50-54 

3.2 
3.5 
1.8 

16.3 
13.3 
10.1 

0.1 

0.5 

-0.3 

3.5 
5.0 
3.8 

60-64 
65-69 
70-74 

2.4 
-0.2 
-0.1 

7.9 
-6.3 
-0.7 

2.8 
-1.1 
0.4 

5.9 

-10.5 

6.2 

55-59 
60-64 

2.1 

2.3 

-0.2 

-0.1 

10.6 

7.3 

-6.7 

-0.7 

1.3 
2.7 

-1.1 
0.4 

7.4 

5.6 

-11.7 

5.8 

75  and 
over 

3.7 

0.3 

6.3 

19.8 

65-69 
70-74 

1  These 

percentage  increases 

are  for 

blacks  and  other 

nonwhite 

75  and 

races. 

over 

3.6 

0.3 

5.9 

16.5 

Source: 

Derived    by    the 

National 

Urban     League 

Research 

Note:  These  rates  refer  to  the  adjusted  percent  undercounts  in 
Set  D  presented  in  tables  4  and  5  of  Jacob  Siegel's  paper,  cited 
below. 

1  These  rates  are  for  blacks  and  other  nonwhite  races. 

Source:  Prepared  by  the  National  Urban  League  Research 
Department  from  data  in  Jacob  Siegel,  "Estimates  of  Coverage  of 
the  Total  Population  by  Sex,  Race,  and  Age  in  the  1970  Census," 
U.S.  Census  Bureau,  1973;  and  1970  General  Population 
Characteristics. 

blacks  (or  all  whites,  for  that  matter)  are  undercounted 
to  the  same  extent  in  every  State,  county  or  city.  On  the 
contrary,  since  the  national  undercount  rates  vary  among 
blacks  by  sex  and  age,  the  synthetic  method  also  assumes 
that  the  total  undercount  rates  for  blacks  in  different 
States  and  localities  will  vary,  depending  upon  the  sex 
and  age  distributions  of  the  black  populations  in  those 
localities.  This  is  why  the  synthetic  method  yields 
different  undercount  rates  for  blacks  in  different  States 
(e.g.,  Vermont,  7.9  percent;  Maine,  7.6  percent;  New 
York,  7.2  percent;  South  Carolina,  6.6  percent;  Missis- 
sippi, 6.3  percent)  and  cities  (e.g.,  New  York,  7.2 
percent;  Philadelphia,  6.9  percent;  New  Orleans,  6.7 
percent;  Charleston,  6.5  percent), 
(b)  Second,  the  synthetic  method  does  not  assume  that  the 
undercount  rates  for  persons  in  the  same  race/sex/age 
categories  are  the  same  in  absolute  numbers  in  different 
subnational  units.  On  the  contrary,  this  method  assumes 
that  the  undercount  rates  for  persons  in  the  same 
race/sex/age  groups  are  different  in  different  subnational 
localities,  but  that  those  differences  are  not  statistically 
significant.  In  other  words,  for  example,  we  assume  that 
the  undercount  rate  for  black  males  ages  35  to  39  in 


Department from Census Bureau undercount rates in Jacob Siegel,
"Estimates of Coverage of the Population by Sex, Race, and Age in
the 1970 Census," U.S. Census Bureau, 1973; and 1970 General
Population  Characteristics.  Reprinted  from  Robert  B.  Hill  and 
Robert  B.  Steffes,  "Estimating  the  1970  Census  Undercount  for 
State  and  Local  Areas,"  NUL  Research  Department,  Washington, 
D.C.  July  23,  1973. 

Birmingham  will  differ  in  absolute  numbers  from  the 
undercount  rate  for  black  males  35  to  39  in  Detroit,  but 
that  neither  undercount  rate  will  be  statistically  different 
from  the  17.8  percent  undercount  rate  for  black  males  in 
that  age  category  at  the  national  level. 

Deriving  Local  Undercounts 

The synthetic method can easily be used by nonstatisticians
to derive the census undercount for any area below
the national level. One only needs to apply the appropriate
percentage increase to the official 1970 census count for
specific race/sex/age groups in specific localities. Because
each undercount rate is expressed as a percentage of the
corrected population, the published count is multiplied by
1/(1 - rate). For example, an undercount rate of 14.4 percent
for black males ages 30 to 34 requires that the official (or
published) census count of black men 30 to 34 years old in a
locality be increased (or inflated) by 16.8 percent (that is,
multiplied by 1.168). Similarly, an undercount rate of 9.7
percent for black females under the age of 5 requires that the
official census count of black females under 5 be increased
by 10.7 percent (that is, multiplied by 1.107). This step is
repeated for each age category in each of the four key subgroups:
black males, black females, white males, and white females.
Using the percentage increases listed in table 2, one can
derive and adjust for the census undercount for any State or
local area.
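
The calculation just described can be sketched in a few lines of Python.
This is only an illustration: the two undercount rates are the ones quoted
in the text, but the published counts for the locality are hypothetical,
and the function names are invented for the example. It assumes, as the
worked figures imply, that each rate is a percentage of the corrected
population.

```python
def inflation_factor(undercount_rate_pct):
    """Multiplier applied to a published count.

    Assumes the undercount rate is a percentage of the corrected
    (true) population, so published = corrected * (1 - rate).
    """
    return 1.0 / (1.0 - undercount_rate_pct / 100.0)


def corrected_counts(published, rates_pct):
    """Apply national rates, group by group, to one locality's counts."""
    return {group: published[group] * inflation_factor(rates_pct[group])
            for group in published}


# National undercount rates quoted in the text (percent).
rates = {
    ("black", "male", "30-34"): 14.4,
    ("black", "female", "under 5"): 9.7,
}

# Hypothetical published counts for one locality (not real data).
published = {
    ("black", "male", "30-34"): 1000,
    ("black", "female", "under 5"): 2000,
}

corrected = corrected_counts(published, rates)
total_published = sum(published.values())
total_corrected = sum(corrected.values())

# The locality's overall rate depends on its own sex/age mix.
local_rate_pct = 100.0 * (total_corrected - total_published) / total_corrected
```

Because the overall local rate is a weighted average of the group rates,
two localities with different sex and age distributions obtain different
total undercount rates even though the same national rates are applied
to both.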
Table 3. Estimated Corrected Count of U.S. Population by State, for All Races: 1970

State                    Percent     Published     Corrected      Amount       Percent
                      undercount         count         count  undercount  distribution

Total                        2.5   203,209,700   208,578,600   5,369,192           100
California                   2.6    19,953,120    20,478,160     525,068           9.8
New York                     2.6    18,236,960    18,730,720     493,774           9.2
Illinois                     2.7    11,113,970    11,417,090     303,137           5.6
Texas                        2.6    11,196,730    11,496,050     299,338           5.6
Pennsylvania                 2.4    11,793,900    12,078,090     284,197           5.3
Ohio                         2.4    10,652,010    10,914,220     263,227           4.9
Michigan                     2.5     8,875,083     9,106,498     231,427           4.3
Florida                      2.7     6,789,443     6,977,481     187,748           3.5
New Jersey                   2.5     7,168,164     7,354,424     186,271           3.5
North Carolina               3.1     5,082,059     5,243,506     161,459           3.0
Georgia                      3.3     4,589,575     4,743,749     154,182           2.9
Virginia                     2.9     4,648,494     4,789,009     140,526           2.6
Louisiana                    3.4     3,641,306     3,768,703     127,403           2.4
Massachusetts                2.1     5,689,170     5,812,908     123,749           2.3
Indiana                      2.3     5,193,669     5,316,073     122,415           2.3
Maryland                     2.9     3,922,399     4,040,880     118,489           2.2
Missouri                     2.5     4,676,501     4,794,681     118,189           2.2
Alabama                      3.2     3,444,165     3,556,463     112,305           2.1
Tennessee                    2.7     3,923,687     4,032,421     108,744           2.0
Wisconsin                    2.1     4,417,731     4,512,863      95,145           1.8
South Carolina               3.4     2,590,516     2,682,448      91,935           1.7
Mississippi                  3.6     2,216,912     2,299,841      82,932           1.5
Minnesota                    2.0     3,804,971     3,883,745      78,784           1.5
Washington                   2.2     3,409,169     3,485,873      76,714           1.4
Kentucky                     2.3     3,218,706     3,293,598      74,899           1.4
Connecticut                  2.3     3,031,709     3,102,706      71,004           1.3
Oklahoma                     2.5     2,559,229     2,623,726      64,502           1.2
Iowa                         2.0     2,824,376     2,882,142      57,772           1.1
Arkansas                     2.7     1,923,294     1,977,642      54,350           1.0
Kansas                       2.2     2,246,578     2,297,738      51,162           1.0
Colorado                     2.2     2,207,259     2,256,692      49,434           0.9
District of Columbia         5.8       756,510       803,224      46,715           0.9
Oregon                       2.1     2,091,385     2,135,577      44,193           0.8
Arizona                      2.4     1,770,900     1,814,107      43,209           0.8
Hawaii                       5.3       768,561       811,235      42,676           0.8
West Virginia                2.0     1,744,237     1,780,625      36,390           0.7
Nebraska                     2.1     1,483,493     1,515,556      32,064           0.6
New Mexico                   2.4     1,016,000     1,041,203      25,204           0.5
Utah                         2.0     1,059,273     1,081,384      22,112           0.4
Rhode Island                 2.1       946,725       967,122      20,398           0.4
Maine                        2.0       992,048     1,011,813      19,765           0.4
Delaware                     2.7       548,104       563,440      15,337           0.3
Montana                      2.2       694,409       709,741      15,333           0.3
New Hampshire                2.0       737,681       752,536      14,856           0.3
South Dakota                 2.2       665,507       680,286      14,780           0.3
Idaho                        2.0       712,567       727,217      14,651           0.3
North Dakota                 2.1       617,761       630,782      13,022           0.2
Nevada                       2.4       488,738       500,947      12,210           0.2
Alaska                       3.3       300,382       310,492      10,111           0.2
Vermont                      2.0       444,330       453,183       8,853           0.2
Wyoming                      2.1       332,416       339,444       7,029           0.1

Source: Prepared by the National Urban League Research Department from data in Jacob Siegel, "Estimates of Coverage of
the Population by Sex, Race, and Age in the 1970 Census," U.S. Census Bureau, 1973; and 1970 General Population
Characteristics for each State.
Table 4. Estimated Corrected Count of U.S. Population by State, for Blacks and Other Races: 1970

State                    Percent     Published     Corrected      Amount       Percent
                      undercount         count         count  undercount  distribution

Total                        6.9    25,462,850    27,352,110   1,889,290           100
New York                     7.2     2,402,877     2,589,224     186,352           9.8
California                   7.3     2,192,102     2,363,612     171,515           9.0
Illinois                     7.0     1,513,595     1,628,169     114,575           6.0
Texas                        6.8     1,479,602     1,588,325     108,724           5.7
Georgia                      6.7     1,198,333     1,284,906      86,574           4.5
North Carolina               6.7     1,180,292     1,264,918      84,627           4.4
Michigan                     7.0     1,041,609     1,120,520      78,912           4.1
Florida                      6.9     1,070,100     1,148,807      78,708           4.1
Pennsylvania                 6.9     1,056,177     1,134,244      78,068           4.1
Louisiana                    6.6     1,099,808     1,177,678      77,871           4.1
Ohio                         6.9     1,005,020     1,079,694      74,675           3.9
Virginia                     6.9       886,980       952,585      65,605           3.4
New Jersey                   7.1       818,256       881,242      62,987           3.3
Alabama                      6.4       910,334       973,042      62,708           3.3
South Carolina               6.6       796,086       852,589      56,504           2.9
Mississippi                  6.3       823,629       879,351      55,723           2.9
Maryland                     7.1       727,511       782,772      55,262           2.9
Tennessee                    6.6       629,757       673,970      44,213           2.3
District of Columbia         7.2       547,238       589,574      42,337           2.2
Hawaii                       7.2       470,401       506,775      36,375           1.9
Missouri                     6.7       499,056       535,173      36,117           1.9
Indiana                      6.9       373,345       401,038      27,694           1.4
Arkansas                     6.3       357,379       381,225      23,847           1.2
Oklahoma                     6.6       278,867       298,601      19,735           1.0
Kentucky                     6.6       236,940       253,698      16,759           0.8
Massachusetts                7.2       211,546       228,080      16,535           0.8
Connecticut                  7.3       196,251       211,661      15,411           0.8
Washington                   7.3       158,114       170,494      12,381           0.6
Arizona                      6.9       165,952       178,301      12,349           0.6
Wisconsin                    7.1       158,772       170,962      12,191           0.6
Kansas                       7.0       124,510       133,824       9,314           0.4
Colorado                     7.3        94,907       102,427       7,520           0.3
New Mexico                   7.0       100,185       107,673       7,488           0.3
Delaware                     7.0        81,645        87,781       6,136           0.3
Minnesota                    7.3        68,933        74,328       5,395           0.2
Alaska                       7.3        63,615        68,636       5,021           0.2
West Virginia                6.1        70,757        75,313       4,556           0.2
Oregon                       7.0        59,306        63,803       4,497           0.2
Nebraska                     7.0        50,626        54,415       3,789           0.2
Nevada                       7.4        40,561        43,792       3,231           0.1
Iowa                         7.1        41,614        44,773       3,159           0.1
South Dakota                 6.9        35,174        37,787       2,613           0.1
Rhode Island                 7.4        31,968        34,527       2,559           0.1
Montana                      7.0        31,366        33,714       2,348           0.1
Utah                         7.0        27,347        29,392       2,045           0.1
North Dakota                 7.2        18,276        19,683       1,407           0.0
Idaho                        7.2        13,765        14,832       1,067           0.0
Wyoming                      7.1         9,392        10,107         715           0.0
Maine                        7.6         6,772         7,331         559           0.0
New Hampshire                7.7         4,575         4,956         381           0.0
Vermont                      7.9         1,777         1,929         152           0.0

Source: Prepared by the National Urban League Research Department from data in Jacob Siegel, "Estimates of Coverage of
the Population by Sex, Race, and Age in the 1970 Census," U.S. Census Bureau, 1973; and 1970 General Population
Characteristics for each State.
PAST USES OF SYNTHETIC METHOD

Since  1973,  the  synthetic  method  has  been  used  by 
several  nongovernmental  researchers  and  organizations  to 
correct  for  the  census  undercount.  However,  at  the  time  that 
the  Census  Bureau  released  its  analysis  of  the  undercount  in 
the  1970  census  by  race,  sex,  and  age  at  the  national  level  on 
April  25,  1973,  the  Bureau  indicated  that  it  had  not 
developed  a  method  for  distributing  the  national  undercount 
among States and local areas [7].

Consequently, the National Urban League's (NUL) Research
Department decided to investigate the feasibility of
developing estimates of the census undercount for areas below
the national level. On July 23, 1973, the NUL Research
Department released its report, "Estimating the 1970 Census
Undercount for States and Local Areas," at a press conference
held during the National Urban League's Annual
Conference that summer. That study distributed the national
undercount of 5.3 million persons among all 50 States as well
as among 36 selected cities. Moreover, it provided crude
estimates of the amount of general revenue sharing funds lost
by 50 States and 20 cities as a result of the census
undercount [2].

Another study using the synthetic method to adjust for
the undercount was released in July 1973 by I.R. Savage and
B.M. Windham of Florida State University. However, that
report derived estimates of the census undercount only for
the State of Florida. Moreover, it used only the overall national
undercount rates for blacks and for whites to adjust for
the undercount. Therefore, variations in the undercount
among blacks (and whites) due to different sex and age
distributions in different Florida localities were not
systematically incorporated by this crude version of the synthetic
method. Thus, this study, in effect, made the clearly
untenable assumption that the undercount rate was the same
for all blacks (and for all whites) in the State of Florida
regardless of their sex and age differences [6].

Most of the subsequent research studies using the synthetic
method focused on the impact of the undercount on
Federal grants-in-aid to States and local areas. Strauss and
Harkins prepared an analysis of the impact of the census
undercount on the allocation of general revenue sharing
(GRS) funds to the States of New Jersey and Virginia for the
Joint Center for Political Studies. This study not only
corrected for the population component in the general
revenue sharing formula by using the national undercount
rates for specific race/sex/age groups, but also updated per
capita income for localities by race and sex [10].

A comprehensive evaluation of the impact of the census
undercount on GRS allocations to subnational jurisdictions
throughout the country was conducted by the Stanford
Research Institute (SRI) under contract to the Office of
Revenue Sharing of the U.S. Treasury Department. But SRI's
mandate was much broader than solely assessing the significance
of adjusting for the population undercount in allocating
GRS funds. Its primary objective was to determine the
inequities that might result from deficiencies in any of the
major data elements in the GRS allocation formulas, i.e.,
total population, per capita income, adjusted local taxes,
urbanized population, personal income, State and local taxes,
State individual income tax, Federal individual income tax
liabilities, and intergovernmental transfers [9].

Some of the major defects in the GRS data elements
identified by SRI were lack of timeliness, lack of
comprehensiveness, and inaccuracies. The uneven currency of the
data was reflected in the fact that while updated figures for
general tax efforts were available for all 39,000 jurisdictions,
the per capita income figures used in the allocation formula
for States and localities were based on 1969 total money
income from the 1970 census. Similarly, while updated
annual estimates of total population for States were used to
allocate GRS funds, the total population figures used for all
sub-State jurisdictions were from the 1970 census. Furthermore,
the omission of 5.3 million uncounted persons from
the updated State figures and from the outdated sub-State
figures compounded the incompleteness and inaccuracy of
the population figures used in the GRS formula.

Although  the  Stanford  Research  Institute's  GRS  data 
study  of  1974  revealed  that  population  was  not  as  significant 
a  factor  as  income  or  general  tax  effort  in  the  GRS  formula, 
it  nevertheless  concluded  that  the  equity  of  allocations  to  the 
50  States  and  the  District  of  Columbia  would  be  increased  if 
the  synthetic  method  were  used  to  adjust  for  the  population 
undercount at the State level. Unfortunately, SRI's recommendation
to improve the accuracy of the population data in
the GRS formula by adjusting for the census undercount has
not yet been implemented by the Office of Revenue Sharing.

At a meeting in May 1974, a black leadership group
(which was the forerunner of the Census Advisory Committee
on the Black Population for the 1980 Census)
requested that the Census Bureau prepare its own assessment
of the impact of the census undercount on the distribution
of Federal funds and political representation to States and
local areas. That study, formally issued in August 1975, used
several versions of the synthetic method based on different
assumptions to derive various estimates of the census
undercount for States. It concluded that, in general, relatively
small shifts in the distribution of GRS funds and of
political representation at the State level would occur if
improvements were made in the quality of population and
other data elements in allocation formulas. Moreover, the
Bureau cautioned that, while it used the synthetic method to
derive illustrative estimates for purposes of its analysis, this
study should not be construed as an endorsement (or as a
repudiation) of the synthetic method [11].

On the other hand, the National Commission on Employment
and Unemployment Statistics, in its final report issued
in September 1979, strongly recommended that the synthetic
method be used to adjust for the undercount in
governmental labor force statistics. It argued that an adjustment
for the undercount would in fact be smaller in
magnitude than the adjustments the Census Bureau traditionally
makes to account for underreports (of income and
unemployment, for example) in the Current Population
Survey (CPS):

Because  the  uncounted  population  is  not  directly 
measured,  the  undercount  adjustment  of  labor  force 
statistics  would  require  an  assumption  that,  within  each 
demographic  group,  the  labor  force  status  of  persons 
missed  in  the  census  resembles  the  status  of  those 
counted.  Granted  that  this  comparability  assumption  is 
questionable,  it  still  may  be  less  objectionable  than 
assuming that unenumerated persons do not exist. Furthermore,
comparability assumptions are already made in
present  noninterview  adjustments  and  second-stage  ratio 
estimates.  The  percentage  rate  of  underreporting  in  the 
CPS  sample,  relative  to  controls  derived  from  the  census 
counts,  is  actually  larger  in  magnitude  than  the  census 
undercount relative to the independently derived estimates
of the true population for most cells in table 8-5.
This  CPS  underrepresentation  has  always  been  offset  by 
assuming  that  the  unreported  persons  have  the  same 
characteristics  as  enumerated  persons  in  the  same  age-sex- 
race  groups.  Adjusting  for  the  census  undercount  would 
merely  be  an  extension  of  present  practice— and  a  lesser 
magnitude— to  bring  the  CPS  estimates  in  line  with  the 
estimated  true  population  [3] . 

In fact, the Commission's recommendation that the labor
force statistics be adjusted for the undercount was a
reaffirmation of the position of its predecessor, the President's
Committee to Appraise Employment and Unemployment
Statistics, which was appointed by President John F.
Kennedy in 1961. In its report released in 1962, this
committee, more popularly known as the Gordon Committee,
recommended "that in the preparation of the
household survey estimates, an adjustment be introduced to
take account of underrepresentation of the population in the
decennial census." However, this recommendation was not
implemented by the Census Bureau [3].

The  Panel  on  Decennial  Census  Plans  established  in  1978 
by the National Research Council of the National Academy
of Sciences also concluded that (a) inequities in the
allocation  of  Federal  funds  could  be  reduced  by  adjusting  for 
the  census  undercount  for  States  and  local  areas,  and  (b)  that 
feasible  methods  for  making  such  adjustments  already 
existed.  Yet,  the  Panel  stopped  short  of  recommending  (a) 
that  a  specific  method  be  used  to  make  such  adjustments  and 
(b)  that  the  Secretary  of  Commerce  direct  the  Census  Bureau 
to  make  these  adjustments  in  the  1980  census  population 
figures  for  States  and  local  areas  [5] . 

Thus, this overview of past uses of the synthetic method
reveals that there is widespread consensus among nongovernmental
researchers that adjustments for the census undercount
would reduce inequities to State and local areas and
that the synthetic method is one feasible method for making
such adjustments. But what are some advantages and
disadvantages in using the synthetic method for such purposes?


ADVANTAGES  AND  DISADVANTAGES  OF 
SYNTHETIC  METHOD 

In order to properly assess the comparative advantages
and disadvantages of using the synthetic method, it is
necessary to briefly describe the basic elements of other
methods that have been used in the past to derive estimates
of the census undercount, most especially the matching and
demographic methods.

Matching  Method 

The matching method was the principal means of deriving
estimates of the census undercount in the 1950 census. It
involves matching the results from at least two independent
sources. This has often meant comparing the characteristics
of persons in the census with (a) the characteristics of
persons obtained in the monthly Current Population Survey
(CPS) of about 50,000 households or (b) the characteristics
of persons in an independent list or register.
However, since lists by their very nature are incomplete and
relate to only segments of the population (such as heads of
households, children who are students, Medicare enrollees,
and automobile owners), more than one list would often be
needed in order to properly match up with a census
household roster. Obviously, a major difficulty, in addition
to finding a more complete listing, is to obtain a list that is
close in time to the census. Estimates of the undercount
were derived for 1970 by matching the census to the CPS.
But males did not show a higher undercount rate than
females according to the matching estimates, although this
relationship has been consistently documented, based on
demographic analyses. This discrepancy is most likely due to
correlation bias: persons who are missed in the census are
also more likely to be missed in the CPS [5].
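
Correlation bias can be illustrated with a small sketch. The paper does
not state which estimator was applied to the matched data; the code
below assumes the standard two-source (capture-recapture) formula, in
which the population is estimated as the product of the two counts
divided by the number of matched persons. The strata sizes and coverage
probabilities are hypothetical and chosen only to show the direction of
the bias.

```python
def dual_system_estimate(n_census, n_survey, n_matched):
    """Two-source population estimate (assumed formula, see lead-in).

    Unbiased only if being missed by one source is independent of
    being missed by the other.
    """
    return n_census * n_survey / n_matched


# Hypothetical strata: (size, P(counted in census), P(counted in survey)).
# The second stratum is hard to reach for BOTH sources.
strata = [
    (800, 0.98, 0.95),
    (200, 0.70, 0.60),
]

true_total = sum(size for size, _, _ in strata)

# Expected counts when coverage differs by stratum.
n_census = sum(size * p_census for size, p_census, _ in strata)
n_survey = sum(size * p_survey for size, _, p_survey in strata)
n_matched = sum(size * p_census * p_survey
                for size, p_census, p_survey in strata)

estimate = dual_system_estimate(n_census, n_survey, n_matched)

# Undercount implied by matching versus the true undercount (percent).
estimated_undercount_pct = 100.0 * (estimate - n_census) / estimate
true_undercount_pct = 100.0 * (true_total - n_census) / true_total
```

In this illustration the matched estimate falls between the census count
and the true total, so the undercount implied by matching is smaller
than the true undercount. That is the pattern the text attributes to
persons who are missed by both the census and the CPS.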

Demographic  Method 

The  method  of  demographic  analysis  has  been  the  basic 
procedure  for  deriving  national  estimates  of  the  undercount 
for  censuses  in  1950,  1960,  and  1970.  It  involves  comparing 
the  actual  number  of  persons  enumerated  in  the  census  with 
the  number  of  persons  "expected"  to  be  counted.  The 
"expected"  or  "true"  population  is  derived  independently  of 
the  census  and  refers  to  "the  number  born  minus  the  number 
who  have  died,  adjusted  to  account  for  the  number  who  have 
moved  into  or  out  of  the  country."  Thus,  a  variety  of  sources 
are  used:  Birth  records,  death  certificates,  Medicare  records, 
and  immigration  and  emigration  statistics  [5] . 
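
The accounting behind the demographic method can be written out as a
short sketch. The function names and all input figures below are
hypothetical and serve only to show how the "expected" population and
the resulting undercount rate are related; they are not estimates from
the 1970 census.

```python
def expected_population(births, deaths, immigrants, emigrants):
    """Number born, minus deaths, adjusted for movement into and
    out of the country (the definition quoted in the text)."""
    return births - deaths + immigrants - emigrants


def net_undercount_rate_pct(expected, enumerated):
    """Undercount as a percentage of the expected population."""
    return 100.0 * (expected - enumerated) / expected


# Hypothetical figures for a single birth cohort (not real data).
expected = expected_population(
    births=1_000_000,
    deaths=150_000,
    immigrants=60_000,
    emigrants=10_000,
)
enumerated = 877_500  # hypothetical census count for the same cohort

rate_pct = net_undercount_rate_pct(expected, enumerated)
```

Any error in the birth, death, or migration inputs passes directly into
the expected population, and therefore into the estimated undercount.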

Clearly, the accuracy of estimates derived by the demographic
method is directly related to the quality of data from
records. The existence of large numbers of undercounted and
uncounted illegal aliens underscores the unreliability of
immigration statistics. But statistics on emigrants leaving the
United States, especially those that have been collected by
the Immigration and Naturalization Service, are similarly of
questionable reliability.

In addition, in-depth analyses by the Census Bureau have
revealed that State-of-birth records for many persons,
especially blacks, inappropriately refer to one's current State
of residence rather than to one's actual State of birth. Thus,
State-of-birth data are often unreliable because of
underreporting and misreporting.

Furthermore,  since  ethnic  origin  is  rarely  recorded  on  the 
birth  and  death  records  for  most  States  (not  to  mention 
internal  migration  information),  it  has  not  been  possible  to 
derive  national  estimates  of  the  undercount  for  such  groups 
as  Hispanics,  using  the  demographic  method. 

Consequently,  because  of  the  incomplete  and  uneven 
quality  of  birth,  death,  and  most  migration  data  for  various 
racial,  ethnic,  and  age  groups  in  different  States,  a  number  of 
questionable  assumptions  must  be  made  to  derive  national 
undercount  estimates  by  the  demographic  method. 

Imputations 

A  procedure  that  is  not  commonly  perceived  as  a  method 
for  deriving  estimates  of  the  undercount  (but  which  in  effect 
does  just  that)  is  that  of  imputation.  Imputation  is  a  method 
by  which  the  Census  Bureau  imputes  the  existence  of  persons 
not  contacted  by  the  census  or  allocates  characteristics  to 
persons  who  were  enumerated  in  the  census  but  for  whom 
certain  traits  (such  as  race,  sex,  and  age)  are  missing  from  the 
census  forms  [5] . 

Many opponents of recommendations to adjust for the
census undercount assume that the official census
counts consist solely of persons who were actually contacted
or interviewed and contain no adjustments for persons
missed or not contacted. In fact, for decades the Census Bureau
has imputed the existence of persons or their characteristics
because of inadequacies in the enumeration procedure or in
the computerized processing of data. And the Bureau has
regularly published these imputed figures for all concerned
parties to examine [14].

In the 1970 census, for example, 4.9 million persons (or
2.4 percent of 203 million persons) were imputed: 2.7
million because of enumeration deficiencies and 2.2 million
because of processing failures. Some of the enumeration
inadequacies include (a) "closeout" cases, i.e., households
that were visited several times, but no one was found to be at
home; (b) households that refused to cooperate; (c) persons
whose existence was determined after the census enumeration
by a post office check; and (d) housing units that were
recorded as vacant, but which may have been occupied.
Examples of processing failures include instances in which
questionnaires may not have been properly microfilmed or
read by the automatic data processing system (i.e.,
FOSDIC, the Film Optical Sensing Device for Input to
Computers).

Characteristics are imputed or allocated to unenumerated
persons and households by randomly assigning characteristics
from other persons and households in their
neighborhood. When a deficient record emerges, characteristics
from the most recently processed record are completely
or partially duplicated and assigned to the record
with the processing information.
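
The allocation rule described above, in which a deficient record borrows
from the most recently processed record, resembles what is often called
a sequential hot deck. The sketch below is a deliberately simplified
illustration of that idea and should not be read as the Census Bureau's
actual procedure; the field names and records are hypothetical.

```python
def sequential_hot_deck(records, fields):
    """Fill missing (None) fields from the most recently processed
    record that reported a value for that field."""
    donor = {}          # latest reported value seen for each field
    completed = []
    for record in records:
        filled = dict(record)
        for field in fields:
            if filled.get(field) is None:
                if field in donor:
                    filled[field] = donor[field]   # impute from donor
            else:
                donor[field] = filled[field]       # update donor
        completed.append(filled)
    return completed


# Hypothetical records in processing order; None marks a missing item.
records = [
    {"age": 34, "sex": "F", "race": "white"},
    {"age": None, "sex": "M", "race": None},      # partially missing
    {"age": None, "sex": None, "race": None},     # wholly imputed person
]

completed = sequential_hot_deck(records, ["age", "sex", "race"])
```

The third record receives every characteristic from its predecessors in
the processing order, so whatever was recorded for nearby households
determines the characteristics assigned to the imputed person.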

Obviously, some of these imputation methods are
questionable. For example, uncontacted blacks living in
predominantly white areas and uncontacted whites living in
predominantly black areas would most likely be assigned
characteristics from households of the opposite race, based
on these allocation procedures. Moreover, while national
vacancy surveys may reveal that every nth housing unit that
is designated as vacant is in fact occupied, the random
assigning of people and characteristics of households to every
nth vacant unit is highly tenuous.

In  short,  the  official  census  counts  used  for  congressional 
apportionment  and  governmental  grants-in-aid  traditionally 
contain  imputed  or  adjusted  numbers  for  millions  of  persons 
who were never contacted or interviewed by census enumerators.
Thus, the Census Bureau's published national estimates
of  the  undercount  do  not  reflect,  as  is  popularly  assumed, 
the  universe  of  persons  not  contacted  by  the  census.  In  other 
words,  the  "actual"  number  of  persons  not  contacted  in  the 
1970  census  was  10.2  million.  However,  since  4.9  million  of 
them  were  imputed  by  allocation  procedures,  this  left  the 
residual  of  5.3  million  still  uncounted.  Consequently,  adding 
the  5.3  million  uncounted  persons  to  the  official  count 
would  merely  be  an  extension  of  current  imputation  and 
adjustment  procedures. 

Comparative  Criteria 

We will now review the comparative advantages and
disadvantages of the synthetic method according to the
following criteria: internal consistency, simplicity,
timeliness, flexibility, equity, and reliability.

1. Internal Consistency. Ideally, any method for deriving
subnational estimates of the undercount should yield
estimates that are internally consistent with those at the national
level  or  at  the  next  larger  geographical  context.  Internal 
consistency  of  undercount  figures  would  greatly  facilitate  the 
proper  apportioning  of  governmental  funds  or  political 
representation  among  all  geographical  subunits.  For  example, 
undercount  figures  for  an  aggregate  of  cities  should  equal 
those  for  their  respective  counties,  which,  in  turn,  should 
equal  the  figures  for  their  respective  SMSA's,  which,  in  turn, 
should  equal  the  figures  for  their  respective  States,  etc. 

While the demographic method appears to be the most
reliable procedure for deriving undercount estimates by
race/sex/age at the national level, it is not the most feasible
for yielding internally consistent estimates below the
national level. In fact, the Census Bureau's own in-depth
research in this area reveals that the demographic method
would produce undercount estimates for States that would
be widely inconsistent with national estimates because of the
uneven quality of birth and death records and the lack of
adequate internal migration data between States. Highly
questionable adjustments would be needed to make those
States' figures congruent with nationwide figures [13].

The matching method, on the other hand, would yield
even less reliable and consistent undercount estimates for
States than the demographic method. Because of the low
probability of securing adequate and timely lists or rosters of
household composition, the matching method appears to be
one of the least feasible for producing subnational
undercount estimates that would be internally consistent with
national estimates. The adjustments required to make the
widely varying undercount estimates for States derived by
the matching method consistent with national estimates
would be so arbitrary as to be meaningless and
indefensible.

The  synthetic  method,  by  definition,  yields  undercount 
estimates  for  subnational  units  that  are  internally  consistent 
with  national  estimates.  Thus,  no  further  adjustments  of 
these  estimates  would  be  needed,  as  in  the  case  of  the 
demographic  or  matching  methods. 
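
The reason the synthetic estimates add up can be seen in a brief sketch.
The same multiplier is applied to a given race/sex/age group wherever it
lives, so the corrected counts for a set of areas that partition the
country sum to the corrected national count. The group labels, area
names, multipliers, and counts below are hypothetical.

```python
def correct(counts, factors):
    """Multiply each group's published count by its national factor."""
    return {group: counts[group] * factors[group] for group in counts}


# Hypothetical national multipliers for two demographic groups.
factors = {"group_a": 1.168, "group_b": 1.107}

# Hypothetical published counts for two areas that partition the nation.
areas = {
    "area_1": {"group_a": 100, "group_b": 300},
    "area_2": {"group_a": 400, "group_b": 200},
}

# Published national counts are the sums over areas.
national = {g: sum(area[g] for area in areas.values()) for g in factors}

corrected_by_area = {name: correct(counts, factors)
                     for name, counts in areas.items()}
sum_of_area_totals = sum(sum(c.values()) for c in corrected_by_area.values())
national_total = sum(correct(national, factors).values())


def area_rate_pct(name):
    """Overall undercount rate of one area, as percent of corrected count."""
    corrected = sum(corrected_by_area[name].values())
    published = sum(areas[name].values())
    return 100.0 * (corrected - published) / corrected
```

The two area totals agree with the national total without any further
adjustment, while the overall rate still differs between the areas
because their group compositions differ.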

In  addition,  the  imputation  method  could  be  used  to 
allocate  synthetic  subnational  estimates  of  the  undercount  to 
specific  household  records  consistent  with  national  estimates. 
Using  an  approach  similar  to  that  by  the  Census  Bureau  in 
assigning  persons  to  vacant  units,  one  could  randomly 
apportion  unenumerated  persons  by  race,  sex,  and  age  to 
both  enumerated  and  unenumerated  households  based  on  the 
synthetically  derived  undercount  estimates  for  specific 
subnational  geographic  areas. 

2. Simplicity. The synthetic method is by far the simplest
to apply in deriving subnational undercount estimates. This
simplicity is a result of the fact that only one assumption
needs to be made in order to implement it: the null
hypothesis. Since each of the other methods (demographic,
matching, and imputation) would require different
assumptions and adjustments, depending on the quality of the data
available in different States, much more sophisticated
statistical skills and training would be needed by users to apply
these methods.

On  the  other  hand,  the  synthetic  method  can  easily  be 
applied  by  nonstatisticians,  such  as  congressional  aides, 
representatives  of  community-based  organizations,  and  other 
research-oriented  individuals  and  public  interest  groups. 

3.  Timeliness.  Timeliness  is  an  essential  feature  for 
assessing the viability of a method for adjusting the undercount.
Clearly, a method that required more than half a
decade  to  derive  subnational  undercount  estimates  would  be 
of  little  practical  utility. 

Ideally,  one  would  like  to  use  a  method  that  could  adjust 
for  the  undercount  before  the  official  counts  are  turned  over 
to  the  President  and  Congress.  Since  the  Census  Bureau 
customarily uses the imputation method to assign persons
and characteristics to specific households before the census
enumeration  is  completed,  this  approach  has  the  advantage 
over  the  other  procedures  with  regard  to  earliest  imple- 
mentation. 

Subnational  estimates  of  the  undercount  using  the 
demographic  method  would  not  be  possible  until  after  the 
national  estimates  were  derived  by  this  procedure.  Since  it 
took  about  3  years  to  derive  the  national  estimates  of  the 
undercount  in  the  1970  census,  one  could  not  expect  to 
obtain  national  estimates  of  the  undercount  using  the 
demographic  method  before  1982.  Additional  years  would 
be  required  to  derive  reliable  estimates  of  the  undercount  by 
the  demographic  method  for  all  50  States.  Somewhat  similar 
protracted  timing  would  be  anticipated,  using  the  matching 
method  to  derive  reliable  estimates  of  the  undercount  for 
each  of  the  50  States  and  the  District  of  Columbia.  It  is  not 
even  possible  to  estimate  the  amount  of  time  that  would  be 
needed  to  derive  reliable  estimates  of  the  undercount  for 
geographical  areas  below  the  State  level  using  either  the 
demographic  or  matching  method. 

However,  once  national  estimates  of  the  undercount  were 
derived  by  another  procedure,  the  synthetic  method  could  be 
used  to  immediately  derive  subnational  estimates  of  the 
undercount  for  all  jurisdictions  below  the  national  level.  If, 
for  example,  national  estimates  of  the  undercount  were 
derived  by  the  demographic  method  by  1982,  subnational 
estimates  could  be  derived  for  all  jurisdictions  within  this 
Nation  within  a  matter  of  weeks  by  using  the  synthetic 
method. 

Since  the  primary  objective  for  making  adjustments  below 
the  national  level  is  to  minimize  the  inequities  to  States  and 
local  areas  as  a  result  of  the  undercount,  a  method  that  can 
reliably  make  such  adjustments  within  a  relatively  short 
period  of  time  is  most  desirable. 

4.  Flexibility.  Flexibility  is  another  important  attribute  of 
a method for deriving undercount estimates. More specifically,
a method that could produce undercount estimates for
all  jurisdictions  regardless  of  size  would  be  highly  desirable 
and  useful  to  most  elected  officials  and  other  policymakers. 

Unfortunately,  most  moderate  and  small-size  jurisdictions 
are  traditionally  penalized  for  their  size.  Since  it  is  easier  to 
derive  more  reliable  estimates  (of  the  undercount,  population 
projections, unemployment figures, etc.) for larger jurisdictions,
smaller size areas are usually excluded.

However,  since  the  primary  objective  for  adjusting  for  the 
census  undercount  is  to  reduce  inequities  to  States  and  local 
areas,  we  feel  that  a  method  that  does  not  exclude  small 
areas  from  such  adjustments  should  be  given  much  greater 
weight  than  procedures  that  tend  to  favor  larger  size 
jurisdictions.  For  example,  27,000  (or  69  percent)  of  the 
39,000  units  of  government  receiving  general  revenue  sharing 
funds  had  populations  under  2,500. 

The  synthetic  method  appears  to  have  greater  flexibility 
than  either  the  demographic  or  matching  method  in  its 
ability  to  derive  undercount  estimates  for  all  jurisdictions 
regardless of size. Another important issue regarding the
flexibility  of  adjustment  procedures  is  the  ability  to  produce 
undercount  estimates  for  other  major  ethnic  groups,  such  as 
Hispanics. 

The  extensive  exploratory  research  conducted  by  the 
Census  Bureau  in  this  area  concluded  that  it  was  very 
difficult  to  derive  undercount  estimates  for  Hispanics  at  the 
national  level  using  the  demographic  method  because  of  the 
absence of Hispanic origin and inconsistent Hispanic classifications
in birth and death records in most States [12].

Theoretically, the matching method could produce undercount
estimates for Hispanics, if reliable lists or rosters of
Hispanic  households  existed  or  if  special-purpose  Hispanic 
household  surveys  were  conducted.  Unfortunately,  such  lists 
are  yet  to  be  found  and  such  surveys  are  yet  to  be 
conducted.  Thus,  for  all  practical  purposes,  the  feasibility  of 
the  matching  method  for  such  purposes  has  yet  to  be 
demonstrated. 

On  the  other  hand,  the  imputation  method  appears  to 
have  the  greatest  potential  for  deriving  undercount  estimates 
for  Hispanics.  As  a  start,  initial  estimates  might  be  derived 
based  on  the  nature  and  numbers  of  imputed  persons  of 
Hispanic origin and on the extent of allocations of characteristics
to Hispanic households.

The  ability  of  the  synthetic  method  to  derive  subnational 
undercount  estimates  for  Hispanics  is  directly  constrained  by 
the  flexibility  of  the  method  used  to  derive  the  national 
estimates.  Thus,  synthetic  methods  that  are  based  on 
national  estimates  derived  by  the  demographic  method  are 
not  able  to  derive  estimates  of  the  undercount  for  Hispanics. 
But  synthetic  methods  that  are  based  on  national  estimates 
derived  by  either  the  imputation  or  matching  method  would 
be  able  to  derive  subnational  undercount  estimates  for 
Hispanics.  Some  exploratory  work  by  the  NUL  Research 
Department strongly suggests subnational undercount estimates
for Hispanics can be developed by using a synthetic
method  based  on  national  estimates  derived  from  imputation 
procedures. 

5.  Equity.  Most  discussions  about  the  relative  equity  of 
different  undercount  adjustment  methods  fail  to  distinguish 
between  two  types  of  equity:  statistical  and  political. 
Statistical  equity  results  from  enhancing  the  quality  of 
statistics  that  may  be  used  as  individual  data  elements  in  a 
distribution  formula,  while  political  equity  results  from  the 
actual  distribution  of  resources  according  to  a  configuration 
of  all  the  elements  in  a  formula.  Enhancing  statistical  equity 
does  not  necessarily  lead  to  greater  political  equity.  The  two 
vary  independently  of  each  other  and  should  be  kept 
conceptually  distinct. 

For example, adjusting population figures for the undercount
in the general revenue sharing formula (or even in
formulas  in  which  population  is  the  major  data  element)  by 
the  most  reliable  method  possible  will  not  ensure  that  there 
will  be  a  more  equitable  distribution  of  resources  to  States 
and  local  areas  that  are  undercount-prone. 


The  distribution  of  resources  to  areas  is  primarily  a 
political,  not  a  technical,  determination.  Statisticians  and 
technicians  can  be  very  influential  in  (a)  recommending  the 
data  elements  to  be  used  in  a  distribution  formula  and  (b)  in 
determining  the  quality  of  the  data  elements.  But  politicians 
are  the  primary  determinants  of  the  final  configuration  of 
data  elements  in  allocation  formulas.  This  phenomenon  is 
popularly  referred  to  as  "computer  politics"  or  "politics  by 
printout": 

The  formula  is  a  tool  for  performing  a  very  old  political 
balancing  act:  putting  the  money  where  the  needs  are 
while  making  sure  that  every  congressional  district  gets 
something.  Formulas  are  supposed  to  provide  a  fair, 
objective  distribution  of  Federal  aid.  But  formula 
elements  are  chosen  politically  and  seemingly  minor 
changes  can  mean  boom  or  bust  for  some  recipients  of 
aid. . . . Formulas are modified to accommodate some statistical
and political realities. . . .

These  compromises  illustrate  a  tension  in  formula  building 
between  the  technician's  desire  to  draft  a  theoretically 
pure  statistical  model  based  on  objective  data  and  the 
politician's  need  to  find  a  mathematically  plausible  way  to 
put money where both the needs and the votes are [8].

Consequently,  it  is  not  fruitful  for  technicians  to  engage 
in  extended  debates  about  the  amount  of  (political)  equity 
that  would  result  from  the  distribution  of  resources  to  areas 
based  solely  on  the  improvement  of  the  quality  of  particular 
data  elements  (i.e.,  statistical  equity).  Any  simulations  done 
should  be  viewed  simply  as  illustrative. 

It  is  very  likely  that,  if  some  adjustment  of  population 
figures  for  the  undercount  was  adopted  by  Congress  to  use  in 
existing  allocation  formulas,  the  formulas  would  not  remain 
the same. In all probability, the configuration of the data
elements in existing formulas would be
modified to satisfy current political realities.

Thus,  the  primary  objective  of  technicians  should  be  to 
determine  the  most  reliable  and  useful  method  for  improving 
the  quality  of  population  figures,  regardless  of  how  that  data 
element  may  eventually  be  used  in  different  allocation 
formulas.  In  short,  the  merits  of  one  adjustment  method  over 
another  should  be  assessed  on  statistical,  and  not  political, 
equity. 

On  balance,  we  feel  the  synthetic  and  imputation  methods 
could provide greater statistical equity to subnational jurisdictions
than either the demographic or matching method.
The  ability  of  the  synthetic  method  and  imputations  to 
provide  estimates  of  the  undercount  for  all  geographical 
areas— regardless  of  size— is  a  major  contributor  to  statistical 
equity. 

Such  simultaneous  adjustments  for  all  jurisdictions  would 
give  smaller  jurisdictions  the  same  probability  of  benefiting 
(or  losing)  as  larger  jurisdictions. 

It  should  be  kept  in  mind  that  these  smaller  jurisdictions 
will  still  be  penalized  for  their  size  in  other  ways.  Adjustment 
of the census undercount only reflects the size of the
population on Census Day, April 1, 1980. Since it is more
difficult  to  derive  reliable  postcensal  estimates  of  population 
for  small  areas,  their  1980  population  figures  may  be  used 
for  the  entire  decade  as  a  basis  for  governmental  allocations, 
while  larger  areas  would  be  able  to  have  updated  population 
figures  used  in  allocation  formulas.  Of  course,  small  areas 
that  lost  population  would  prefer  to  use  the  1980  figure, 
while  those  gaining  population  would  want  to  use  updated 
estimates  in  order  to  gain  political  equity. 

The  ability  of  the  synthetic  method  to  derive  subnational 
estimates  of  the  undercount  for  ethnic  groups,  such  as 
Hispanics,  further  enhances  its  (statistical)  equity.  Of  course, 
this  can  only  be  accomplished  after  national  estimates  of  the 
undercount  are  first  derived  through  either  the  imputation  or 
matching  method. 

Based  on  its  exhaustive  assessment  regarding  the  data 
elements  in  the  general  revenue  sharing  formula,  the  Stanford 
Research  Institute  concluded  that  equity  would  be  enhanced 
by  using  the  synthetic  method  to  adjust  for  the  census 
undercount  at  the  State  level: 

Equity  of  allocations  to  the  50  States  and  the  District  of 
Columbia  can  be  increased  by  adjusting  at  the  State  level 
for  underenumeration,  using  the  national  age/sex/race 
underenumeration  rates  prepared  by  the  Bureau  of  the 
Census.  If  the  national  rates  are  used  to  adjust  for 
underenumeration at the county-area and local government
levels, equity of allocations is likely to increase for
larger  jurisdictions  and  to  decrease  for  many  smaller 
jurisdictions.... 

It  is  SRI's  judgment,  however,  that  on  balance,  the 
accuracy  of  the  State-area  population  estimates  would  be 
improved  as  a  result  of  this  procedure.  Although  it  was 
not  possible  to  prove  that  increased  equity  would  result, 
the  reduction  of  biases  due  to  underenumeration  at  the 
State level is viewed as a positive step by SRI [9].

However,  SRI  was  not  quite  as  confident  about  the  extent  of 
equity that could be gained by using the synthetic method to
adjust  for  the  undercount  below  the  State  level: 

The  adjustment  for  underenumeration  at  the  county  level 
would  generate  negligible  changes  in  equity,  on  the 
average.  Furthermore,  the  assumption  that  the  national 
underenumeration  rates  for  the  96  age,  race,  and  sex 
categories  apply  uniformly  across  all  county  areas  is 
difficult  to  defend.  Even  an  unjustified  attempt  to  adjust 
for  underenumeration  effects  in  the  1970  census  data  is 
considered  by  some  to  be  better  than  no  attempt  at  all. 
However,  from  an  overall  point  of  view,  considering  all 
units of government, the increase in equity is questionable.
Additional research is needed before underenumeration
rates can be accurately portrayed at the local
level. The Bureau of the Census and other organizations
are urged to continue and accelerate the research [9].

It  is  not  entirely  clear  whether  SRI  is  referring  only  to 
political  equity  or  to  statistical  equity,  or  to  both,  as  a  result 
of adjustments for the undercount at the State and sub-State
levels. But its conclusions are largely based on the assumption
that equity (whether political or statistical) depends on
the size of jurisdictions: Greater equity will be
achieved by adjusting for the undercount for large jurisdictions
(such as States) and little equity or, possibly, greater
inequities  might  result  from  adjustments  for  small  areas  (such 
as  counties  and  cities).  We  do  not  think  that  such  an 
assumption  is  warranted  until  one  has  developed  illustrative 
allocation  formulas  that  clearly  distinguish  between  political 
and  statistical  equity. 

6.  Reliability.  While  it  is  highly  desirable  for  any  method 
used  to  adjust  for  the  undercount  to  produce  undercount 
estimates  that  are  internally  consistent,  timely,  flexible,  and 
equitable,  it  is  essential  that  such  estimates  be  reliable  and 
accurate.  The  demographic  method  is  widely  regarded  as 
producing  the  most  reliable  undercount  estimates  at  the 
national  level  at  present.  However,  the  Census  Bureau's  own 
studies  reveal  that  the  demographic  method  is  not  a  feasible 
procedure  for  deriving  reliable  estimates  of  the  undercount 
below  the  national  level  because  of  severe  deficiencies  in  vital 
statistics  and  migration  data  at  the  State  and  local  levels.  But 
the  matching  method  is  even  less  likely  than  the  demographic 
method  to  produce  reliable  estimates  of  the  undercount  at 
any  level— national,  State,  or  local. 

This  leaves  either  the  synthetic  method,  imputations,  or  a 
combination  of  both  as  a  basis  for  producing  reliable 
subnational  estimates  of  the  undercount.  There  is  growing 
consensus  among  research  analysts  that  the  synthetic  method 
is  a  viable  procedure  for  deriving  reliable  estimates  of  the 
undercount— at  least  at  the  State  level.  In  fact,  the  National 
Commission  on  Employment  and  Unemployment  Statistics 
strongly  urged  that  the  synthetic  method  be  used  to  adjust 
for  the  undercount  in  governmental  labor-force  statistics: 

Synthetic  estimates  of  undercoverage  for  each  State  by 
race  and  sex  must  be  viewed  with  extreme  caution.  A 
major  problem  is  that  interstate  migration  statistics  are 
not  collected.  But  even  without  reliable  State  undercount 
estimates, State and local data could be adjusted
according  to  national  undercount  estimates.  In  fact,  State 
estimates  are  presently  prepared  by  using  the  national 
second-stage  ratio  adjustment  weights.  The  question,  then, 
is  not  whether  to  adjust  area  data  to  national  population 
controls,  but  which  controls  to  use— incomplete  popula- 
tion figures  or  figures  adjusted  for  the  census 
undercount [3].

Yet,  synthetic  adjustments  of  the  undercount  do  not  have 
the  same  reliability  for  areas  of  differing  sizes.  While  we  take 
exception to the assumption that statistical equity
necessarily declines with the size of an area, we must agree
with the assumption that reliability declines as areas become
smaller. The reliability of estimates (whether they be
population  projections,  unemployment  statistics,  income,  or 
the  census  undercount,  etc.)  is  clearly  a  function  of  the  size 
of  units.  While  synthetic  estimates  (or  estimates  derived  by 
any  other  method,  for  that  matter)  will  decline  in  reliability 
the smaller the jurisdiction, we do not yet know at what level
the  unreliability  becomes  so  severe  that  no  estimates  should 
be  used. 

However,  in  order  to  ensure  that  every  jurisdiction, 
regardless  of  size,  has  an  equal  opportunity  to  have  its 
population  figures  adjusted  for  the  undercount,  we  still  feel 
that  synthetic  adjustments  should  be  made  for  all  sub- 
national  jurisdictions.  Since  most  small  areas  will  be 
penalized  later  by  not  being  able  to  have  updated  postcensal 
population  estimates  used  in  governmental  allocation 
formulas,  these  initial  adjustments  would  minimize  the 
statistical  inequities  to  small  areas. 

Moreover, we feel that such adjustments for the undercount
are preferable to making the current assumption that
no  births,  deaths,  or  migration  occurred  in  most  small  areas 
over  a  period  of  10  years.  Strangely,  there  appears  to  be  little 
interest  in  determining  the  margin  of  error,  unreliability,  or 
inequity  that  accrues  to  small  areas  by  accepting  this  clearly 
fallacious  assumption  of  no  change  over  a  decade! 

CONCLUSIONS  AND  RECOMMENDATIONS 

The  census  undercount  has  significant  social,  economic, 
political, and legal ramifications. The disproportionate undercount
of blacks and other minorities not only severely flaws
adequate  planning  for  their  needs,  but  also  deprives  these 
groups  of  equitable  resources  and  services.  Moreover,  since 
population  figures  are  used  by  over  100  Federal  programs  to 
allocate billions of dollars each year, States and local areas
with  disproportionate  undercounts  are  deprived  of  their 
proper  share  of  governmental  grants-in-aid.  And,  since  the 
census  count  is  used  as  the  basis  for  allocating  representation 
in  State  and  local  policymaking  bodies,  as  well  as  in  the 
House  of  Representatives,  localities  with  disproportionate 
undercounts  are  also  deprived  of  their  equitable  share  of 
political  representation.  Thus,  such  inequities  require 
immediate  redress.  But  what  method  should  be  used  to 
correct  for  the  undercount  in  population  figures? 

There  is  widespread  agreement  that  the  demographic 
method  currently  produces  the  most  reliable  estimates  of  the 
undercount  at  the  national  level.  But  there  is  also  increasing 
consensus  among  researchers  that  the  synthetic  method  is  a 
viable  procedure  for  deriving  reliable  estimates  of  the 
undercount— at  least  at  the  State  level.  Our  analysis  strongly 
suggests  that  the  synthetic  method  is  the  most  appropriate 
procedure  for  deriving  subnational  undercount  estimates  that 
are  internally  consistent,  timely,  flexible,  and  equitable. 
Moreover,  the  method  is  relatively  simple  to  administer  and 
can  be  implemented  by  nonstatisticians.  While  the  reliability 
of  synthetic  estimates  for  larger  jurisdictions  appears  to  be 
high,  the  reliability  of  estimates  for  small  areas  is 
questionable. 

We  strongly  urge  further  research  into  the  feasibility  of 
deriving  national  and  subnational  undercount  estimates  based 
on the Census Bureau's imputation procedures. It appears
that the imputation method might be the most appropriate
procedure  for  deriving  estimates  of  the  undercount  for  other 
ethnic  groups,  such  as  Hispanics,  Asian  Americans,  and 
Native  Americans.  Moreover,  the  imputation  method  is 
superior  to  all  other  methods  in  its  ability  to  be  implemented 
earliest.  Imputed  or  allocated  figures  adjusted  for  the  census 
undercount  would  be  incorporated  in  the  official  counts 
before  they  are  turned  over  to  the  President  and  Congress. 

Key  Recommendations 

1.  We  strongly  recommend  use  of  the  synthetic  method 
as  a  viable  procedure  for  deriving  estimates  of  the 
undercount  for  States  and  local  areas,  regardless  of 
size. 

2.  We  further  urge  that  synthetic  estimates  of  the 
undercount  for  States  and  local  areas  be  used  in 
governmental  grants-in-aid  distribution  formulas  to 
those  areas. 

3.  We  urge  serious  consideration  be  given  to  using 
imputed  figures  adjusted  for  the  census  undercount  in 
the  official  census  counts  that  are  turned  over  to  the 
President and used as a basis for congressional apportionment
as well as for governmental funding
allocations. 

4.  We  urge  that  further  research  be  conducted  into  the 
feasibility  of  deriving  estimates  of  the  undercount  for 
other  ethnic  groups,  such  as  Hispanics.  While  such 
research  is  underway,  we  recommend  that  the  national 
undercount  rates  for  blacks  be  used  for  Hispanics  in 
order to permit subnational estimates of the undercount
to be derived for Hispanics based on the
synthetic method.3

3  We  feel  that  the  imputation  method  is  one  of  the  most  viable 
procedures  for  deriving  estimates  of  the  undercount  for  nonblack 
minority  groups— especially  Hispanics  and  Asian  Americans.  In  fact, 
estimates  of  the  undercount  for  each  of  these  groups  could  be  derived 
at  the  national  level  by  aggregating  the  figures  relating  to  the  percent 
of  each  of  these  groups  that  were  imputed  at  the  local  level  by  sex 
and  age.  Then  the  synthetic  method  could  be  used  to  derive  estimates 
of  the  undercount  for  each  of  these  groups  for  various  units  below  the 
national  level. 
Comments


Joseph  Waksberg 

Westat 


Although  the  adjustment  of  census  counts  by  the  Census 
Bureau's  Director  might  set  a  precedent  that  may  later  be 
regretted,  the  National  Academy  of  Sciences'  panel  has 
concluded  that  an  adjustment  could  be  made  with  only 
minor  danger  of  having  the  figures  manipulated  for  political 
purposes  or  having  them  distrusted  and  repudiated  by  the 
public. The key to this is an early decision and announcement
of the procedure to be followed in adjusting the data.
The  procedure  should  be  described  in  sufficient  detail  so  as 
to  leave  no  possibility  of  the  perception  that  the  figures  were 
"manipulated"  after  the  census  counts  were  seen.  As  a 
consequence,  the  arguments  for  adjustment  seem  convincing. 
However,  there  is  still  a  question  of  how  to  adjust. 

Synthetic  estimation  is  the  best  way  of  proceeding.  In 
arguing  for  this  procedure,  it  was  not  necessary  for  Dr.  Hill 
to  assume  that  undercoverage  rates  are  equivalent  across 
locales,  however.  For  the  procedures  to  be  desirable,  it  is  not 
necessary  for  the  undercoverage  rates  to  be  equivalent,  only 
that  acting  as  if  they  are  equivalent  would  produce  a  fairer 
distribution  of  funds.  Although  Dr.  Hill  proposed  using 
national  estimates  for  race,  sex,  and  age  for  the  synthetic 
estimates,  it  is  possible  to  base  synthetic  adjustment  on  other 
kinds  of  areas  (regions,  States)  or  other  kinds  of  subgroups, 
if  acceptably  accurate  estimates  of  undercoverage  of  the 
subgroups  were  available. 

It  is  useful  to  examine  the  criteria  leading  to  Dr.  Hill's 
preference  for  a  synthetic  method  of  adjustment— namely, 
internal  consistency,  simplicity,  timeliness,  flexibility, 
equity,  and  reliability.  These  are  important  attributes  for  any 
adjustment  method  (although  perhaps  equity  and  reliability 
are  not  separable),  but  two  other  criteria  should  be  added: 
(1) The adjustment method should have a high probability of
public  acceptance  (giving  further  emphasis  to  the  idea  of 
simplicity),  and  (2)  the  adjustment  should  be  nontrivial.  This 
last  criterion  is  related  to  whether  an  adjustment  should  be 
made  at  all.  Cutoff  values  for  the  undercount  rate  below 
which  no  adjustment  would  be  made  should  be  established  for 
the  overall  undercount  rate  and/or  the  undercount  rates  for 
subgroups.  An  adjustment  should  not  be  made  unless  it 
makes  an  important  difference  in  some  areas  at  least. 

Equity  is  central  to  the  issue  of  adjustment.  The  purpose 
of  adjustment  is  to  improve  equity  (statistical,  not  political), 
and  the  Census  Bureau's  role  is  to  see  to  it  that  the  intent  of 
legislation  or  executive  directives  is  carried  out  as  faithfully 
as  possible. 

While the Canadian procedure deals with estimates of
undercount based on sample surveys, not demographic
analysis, and is not immediately applicable to the types of
synthetic  estimates  proposed  by  Dr.  Hill,  the  principle  of 
establishing  a  criterion  based  on  overall  performance  of 
technique  is  important  and  transferable.  Accepting  the 
mean-squared  error  procedure  or  similar  measures  implies 
that  the  Bureau  should  not  be  distracted  by  the  fact  that 
errors  in  some  localities  may  be  increased  by  an  adjustment 
technique.  It  is  unrealistic  to  require  improvement  in  all 
areas.  Furthermore,  even  the  fact  that  there  may  be  some 
classes  of  areas  that  are  adversely  affected  in  a  similar  way 
should  not  be  a  deterrent  to  accepting  an  adjustment 
technique  if  it  improves  overall  equity,  that  is,  reduces  the 
measure  of  inequity. 
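
The point about judging an adjustment by its overall performance can be illustrated with a small sketch. The true populations used here are hypothetical (in practice they are unknown), the plain mean squared error of counts is used as a stand-in for whatever measure of inequity is adopted, and the function names are illustrative only; the Canadian criterion itself is not reproduced.

```python
from typing import List, Sequence


def mean_squared_error(estimates: Sequence[float],
                       truth: Sequence[float]) -> float:
    """Average squared difference between estimates and true values."""
    return sum((e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth)


def areas_made_worse(census: Sequence[float],
                     adjusted: Sequence[float],
                     truth: Sequence[float]) -> List[int]:
    """Indexes of areas whose absolute error grows after adjustment."""
    return [i for i, (c, a, t) in enumerate(zip(census, adjusted, truth))
            if abs(a - t) > abs(c - t)]


# Hypothetical true, census, and adjusted populations for three areas.
truth = [1000.0, 1000.0, 1000.0]
census = [950.0, 900.0, 1000.0]
adjusted = [990.0, 960.0, 1040.0]

mse_census = mean_squared_error(census, truth)
mse_adjusted = mean_squared_error(adjusted, truth)
worse = areas_made_worse(census, adjusted, truth)
```

In this made-up case the third area, which had no undercount, is pushed further from its true value, yet the overall error falls substantially. An overall criterion would accept the adjustment despite the deterioration in that one area.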

If  one  assumes  the  national  demographic  estimates  of 
undercount  are  virtually  without  error  (which  is  reasonable 
except for Hispanics), then it looks like the use of synthetic
estimates  will  reduce  inequity  unless  there  is  a  very  strange 
distribution  among  areas  of  undercoverage  rates  within 
age-sex  groups,  which  seems  unlikely  to  occur  in  practice. 

Dr.  Hill's  preference  for  use  of  national  estimates,  rather 
than  regional  or  State  estimates,  for  use  in  synthetic 
estimation  is  based  on  the  criteria  of  timeliness  and 
simplicity.  These  are  important  attributes  and  should  be 
abandoned  only  if  there  is  clear  and  definitive  evidence  that 
the  Census  Bureau  could  produce  sufficiently  reliable  sub- 
national  estimates  to  compensate  for  the  loss  of  time  and 
simplicity.  The  careful  itemization  of  assumptions  necessary 
to  produce  Mr.  Siegel's  State  demographic  estimates  made 
quite  clear  the  uncertainty  of  State  estimates.  Similarly,  past 
U.S.  experience  with  dual-measurement  systems  based  on 
surveys  or  reverse  record  checks  is  not  very  encouraging. 

Because  of  the  uncertainty  at  this  time  in  the  way  the 
Bureau  would  produce  State  estimates  and  the  fact  that  the 
methodology  is  still  quasi-experimental,  fairly  firm  rules  of 
how  State  estimates  would  be  derived  could  probably  not  be 
established  in  advance,  as  recommended  by  Dr.  Keyfitz.  This 
ability  to  state  rather  precisely  in  advance  the  methodology 
to  be  used  is  a  crucial  requirement  and  seems  to  be  another 
reason  to  prefer  Dr.  Hill's  approach,  rather  than  other 
alternatives. 

The  synthetic  estimate  proposed  by  Dr.  Hill  is  based  on 
race-sex-age  undercount  rates.  Although  sex  and  age  do  not 
affect  the  results  significantly,  and  while  using  these 
characteristics  goes  against  the  simplicity  criterion,  an 
adjustment  using  race-sex-age  rates  is  preferable.  The 
procedure  to  be  used  will  have  to  be  clearly  explained  to  the 
public;  if  it  is  well  known  that  fairly  accurate  estimates  of 
undercount exist for sex and age, it would be difficult
to  explain  why  cruder  classifications  are  used  than  are 
available. 

There  are  two  points  to  consider  in  the  decision  about 
whether  to  adjust  through  imputation  or  through  other 
methods: (1) The technical issue of what is the easy way and
what  is  the  best  way  to  adjust,  and  (2)  the  kinds  of  effects 
the  adjustment  methods  would  have  on  the  statistics  that 
would  be  used  in  the  formulas. 

There are two ways of adjusting for the undercount:
(1) adjusting the population counts only, leaving the
characteristics unchanged, and (2) actually imputing persons by
duplicating enumerated persons at random. Which of the two is chosen is
likely  to  make  a  significant  difference  in  the  per  capita 
income  figures.  The  lower  per  capita  income  of  the  black 
population  will  be  reflected  in  different  ways  for  the  two 
procedures  in  the  per  capita  income  figures  produced  for 
cities, States, etc. This should be considered as part of the
issue  of  how  to  adjust. 
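
A small worked example may make the difference concrete. It rests on one reading of the two procedures: under the first, the published per capita income stays at its census value; under the second, each imputed person carries the mean income of his or her group, which is the expected result of duplicating persons at random within the group. All counts, incomes, and rates are hypothetical.

```python
from typing import List, Tuple


def per_capita_income(groups: List[Tuple[float, float]]) -> float:
    """groups holds (persons, mean income) pairs for one area."""
    persons = sum(n for n, _ in groups)
    income = sum(n * y for n, y in groups)
    return income / persons


# Hypothetical area: (census count, mean income, undercount rate).
census = [(900.0, 5000.0, 0.00), (92.0, 3000.0, 0.08)]

# Procedure 1: only the counts are adjusted, so the characteristic
# (per capita income) remains as computed from the census records.
pci_counts_only = per_capita_income([(n, y) for n, y, _ in census])

# Procedure 2: imputed persons are added to the records and carry
# their group's mean income (expected value of random duplication).
imputed = [(n / (1.0 - r), y) for n, y, r in census]
pci_imputed = per_capita_income(imputed)
```

Here the per capita figure is about 4,815 under the first procedure and 4,800 under the second, because the added persons come from the lower-income group. In a formula that uses per capita income as a data element, the choice of procedure would therefore affect allocations even when the adjusted population totals are identical.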

In  closing,  all  of  the  discussion  thus  far  assumes  that  it 
will  be  the  Census  Bureau  estimates  of  undercount  that  will 
be  used  for  adjustment.  This  is  a  reasonable  approach.  No 
one  has  questioned  the  Bureau's  competence  in  this  area,  nor 
its  objectivity  or  integrity. 


Floor Discussion


Four questions concerning an adjustment were raised for
consideration: (1) Should an adjustment be made? (2) What
agency should make the adjustments? (3) How is knowledge
about undercount to be obtained? and (4) How should an
adjustment be made? The group should express its opinion
on the question of whether or not to adjust. Discussion first
concentrated  on  question  2,  and  it  was  generally  agreed  that 
any  adjustments  should  be  made  by  the  Census  Bureau. 
Knowledge  of  the  undercount  must  be  obtained  outside  the 
Bureau  because  the  Bureau  cannot  do  much  better  in 
reducing  the  undercount  with  any  expenditure  of  money. 
Therefore,  the  methods  must  tend  toward  demographic 
analysis;  a  postenumeration  survey,  in  which  missing  persons 
would  not  be  found  in  any  large  numbers;  or  an  adminis- 
trative records  match.  Thus,  it  will  be  very  hard  to  obtain 
data  and  any  information  obtained  would  be  subjective.  The 
Census  Bureau  should  decide  on  a  method  of  adjustment.  It 
would  be  a  mistake  to  freeze  the  method  of  estimation  to, 
say,  synthetic,  as  better  methods  will  be  developed. 

A distinction between short-term and long-term
objectives was suggested, however. Short term means what
will be done in the 1980 census, while long term means 1990
and beyond. If the view of the National Academy of Sciences
is taken that it is necessary to describe the procedure in
advance of the census counts, then there is not much time
for a decision to be made: 3 to 4 months.

It  was  made  clear,  however,  that  the  Census  Bureau  was 
not  endorsing  the  use  of  a  synthetic  method  by  using  it  as  an 
illustration  to  judge  the  kinds  of  impacts  that  implementing 
various  kinds  of  public  programs  would  have.  In  particular, 
the  National  Commission  on  Employment  and  Unemploy- 
ment Statistics  never  endorsed  the  synthetic  method, 
although  it  did  call  for  adjustment  of  the  CPS  for  purposes  of 
measuring  unemployment.  While  a  synthetic  method  may  be 
used  to  measure  the  heart  and  substance  of  a  broad 
phenomenon  like  health,  there  are  rare  instances  where  it  was 
used  to  measure  error. 

The discussion seems limited to simple schedules for
making an adjustment. The possibility of using synthetic
estimation when the data for it become available, about
1981, and then applying a broader technique and data in
1983 or 1984 to make another adjustment should be
considered. There is no evidence that public acceptance and
understanding of the methodology are needed.

It  was  also  argued  that,  given  the  constraints  of  time,  the 
Census  Bureau  may  have  to  use  the  most  simple  method.  As 
soon  as  the  postcensal  estimates  are  published,  the  counts  for 
1980 have become irrelevant. For published postcensal
estimates,  the  Census  Bureau  uses  a  series  of  estimates  with 
lists  of  assumptions  that  all  seem  to  be  most  feasible  and 
sensible.  In  that  situation,  the  difficulties  with  using  rates  of 
undercount  seem  much  less  problematic  relative  to  the 
method  of  estimation  being  used,  when  compared  with  the 
problems  of  using  undercount  adjustment  relative  to  the 
census  counts.  Of  course,  the  adjustments  should  be 
incorporated  in  the  postcensal  estimates.  The  objection  to 
adjustment  because  there  would  be  two  sets  of  census  figures 
created  would  not  exist  if  the  adjustment  is  only  made  to  the 
postcensal  estimates.  Also,  while  the  main  objection  to 
imputation  is  one  of  constitutionality,  it  also  removes  the 
urgency  of  trying  to  get  all  individuals  to  return  census 
questionnaires  and  the  Census  Bureau  should  be  praised  for 
backing  away  from  making  more  and  more  imputations. 

Several  additional  points  were  stressed: 

1.  Estimates  of  the  Hispanic  population  range  from  12 
million  to  20  million  persons.  There  should  be  an 
adjustment  to  Hispanic  counts  because  the  language 
problems  and  undocumented  workers  may  cause  large 
undercounts.  Use  of  the  black  undercount  rates  to 
adjust  for  Hispanics  is  the  best  method  currently 
available. 

2. The undercount for demographic subgroups should not
be assumed to be the same in all locales; this would be
counterintuitive. In fact, an important objective of the
Census  Bureau's  evaluation  should  be  to  look  at 
variability  in  rates  of  undercounts  for  race-sex-age 
subgroups  in  different  situations. 

3.  The  ratio  of  undercounts  to  imputations  may  vary  and 
more  research  should  be  done  into  the  variability  of 
imputation  and  overall  undercount  by  different  race 
groups.  This  may  yield  some  way  of  projecting  what 
the  variability  will  be  in  1980  for  use  as  a  provisional 
kind of measure.

BACKGROUND

In  1962  the  U.S.  Supreme  Court  opened  the  reappor- 
tionment process  to  the  scrutiny  of  the  Federal  courts 
[Baker v. Carr, 369 U.S. 186 (1962)]. The Court had
previously denied relief in cases challenging redistricting
plans. In 1946 Justice Frankfurter argued that "courts ought
not to enter this political thicket" [Colegrove v. Green,
328 U.S. 549 (1946)]. By 1964 the mood of the Court had
changed  and  in  the  landmark  cases  of  Wesberry  v.  Sanders, 
376  U.S.  1  (1964),  and  Reynolds  v.  Sims,  377  U.S.  533, 
reh.  den.  379  U.S.  870  (1964),  the  Supreme  Court  entered 
the  "political  thicket."  The  nature  of  the  Court's  rulings  in 
these  cases  was  such  that  the  decennial  census  could  even- 
tually become  a  major  factor  in  reapportionment  and 
redistricting. 

The  rulings  in  both  of  these  cases  rested  on  the  issue  of 
equal  representation.  Prior  to  these  rulings,  and  the  many 
cases  that  followed,  many  States  had  not  reapportioned 
despite  population  changes.  Only  about  one-half  of  the 
States  reapportioned  their  legislatures  after  the  1950  census 
and many had not reapportioned for decades before [4]. The
Supreme  Court  ruled  in  Wesberry  v.  Sanders  that  Article  I, 
section  2,  of  the  Constitution  required  that  "as  nearly  as  is 
practicable  one  man's  vote  in  a  congressional  election  is  to  be 
worth  as  much  as  another's." 

In  Reynolds  v.  Sims,  the  Court  held  that: 

The  Equal  Protection  clause  requires  that  a  State  make  an 
honest  and  good  faith  effort  to  construct  districts,  in  both 
houses  of  its  legislature,  as  nearly  of  equal  population  as 
is  practicable. 

Both the Congress and the State legislatures were now firmly
tied  to  a  population  standard. 

In cases that followed, the Court refined the population
standard. The Court interpreted "as nearly of equal population
as is practicable" as near mathematical equality for
congressional districts, where a deviation of less than 1 percent
is the goal, and as allowing up to a 10-percent deviation among
State legislative districts.
The 20-plus major Supreme Court cases that followed
Baker v. Carr, 369 U.S. 186 (1962), dealt with such topics
as the difference between legislative and congressional
districts, how strict the population equality standard should
be in various circumstances, and how much flexibility States
should be allowed in drawing districts. Populations, average
populations, population variances, population differentials,
population deviations, and population ratios are repeatedly
discussed in the cases, analyses, articles, and textbooks that
deal with reapportionment and redistricting. Court cases
commonly discussed the population variance or differential
of proposed districting plans, comparing calculations of
overall population variance and rationale with court-set
standards for maximum deviation. Texts on the subject
regularly contain a State-by-State analysis of district sizes
and average and/or maximum percentage deviation in
population per seat.
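
The deviation measures mentioned above can be sketched as follows. The district populations are hypothetical, and the choice to compare the overall range (largest minus smallest percentage deviation) with the 1-percent and 10-percent figures is an assumption made for illustration; the cases themselves define which measure applies.

```python
def deviation_summary(district_pops):
    """Percentage deviations of district populations from the ideal size."""
    ideal = sum(district_pops) / len(district_pops)
    pct = [100.0 * (p - ideal) / ideal for p in district_pops]
    return {
        "ideal": ideal,
        "max_abs_pct": max(abs(d) for d in pct),
        "avg_abs_pct": sum(abs(d) for d in pct) / len(pct),
        "overall_range_pct": max(pct) - min(pct),
    }


# Hypothetical plan with four congressional districts.
districts = [498_000, 502_500, 499_500, 500_000]
summary = deviation_summary(districts)

# Illustrative thresholds taken from the figures quoted in the text.
meets_congressional_goal = summary["overall_range_pct"] < 1.0
within_legislative_limit = summary["overall_range_pct"] <= 10.0
print(summary, meets_congressional_goal, within_legislative_limit)
```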

There has been relatively little discussion of the role of
the census in this process. In Professor Robert Dixon's
1968 work, Democratic Representation: Reapportionment in
Law and Politics, the census is mentioned as part of a numbers
game where plaintiffs use census data to demonstrate
"numerical disparities" [2]. Good government groups rely
extensively on population statistics to "prove" that their
proposals for "independent reapportionment commissions"
are needed [8].

The  most  common  discussions  of  reapportionment  and 
redistricting  focus  on  questions  of  legal  rulings  and  court 
intent.  State  legislatures  and  their  staffs  are  even  now 
pondering  the  question,  what  will  the  courts  require  or 
permit  in  the  construction  of  legislative  and  congressional 
districting?  Speculation  revolves  around  how  strictly  the 
courts  will  enforce  the  population  equality  rule  and  how 
much  population  variance,  maximum  and  average,  will  be 
tolerated [5]. Population data are crucial to these discussions,
but  the  census  and  its  accuracy  have  not  been  raised  as 
issues. 

The  Census  of  Reapportionment 

Where  does  the  census  emerge  as  an  issue?  As  we  get 
closer  to  the  taking  of  the  census  in  April  1980,  the  Bureau's 
estimates  and  projections  have  generated  interest  in  both  the 
taking  of  the  census  and  its  use  in  reapportionment/redis- 
tricting.  In  July  of  1979,  the  Bureau  issued  a  set  of 
population  estimates  for  1978  and  the  "reapportionment" 
of  Congress  that  would  result  if  their  figures  were  used  as 
the  apportionment  base. 

According  to  Article  I,  section  2  of  the  U.S.  Constitution: 

Representatives  and  direct  Taxes  shall  be  apportioned 
among  the  several  States  which  may  be  included  in  this 
Union,  according  to  their  respective  numbers  ....  The 
actual  Enumeration  shall  be  made  within  three  Years 
after the first Meeting of the Congress of the United
States, and within every subsequent Term of ten Years,
in  such  Manner  as  they  shall  by  Law  direct. 

Under  the  latest  provisions  of  title  13  of  the  United  States 
Code,  the  Secretary  of  Commerce  is  directed  to  take  a 
census  of  population  as  of  April  1st  of  every  10th  year, 
starting  in  1980,  and  to  report  the  results  of  this  "decennial 
census"  to  the  President  within  9  months  of  its  taking, 
for  use  in  apportionment  of  Representatives  in  Congress. 
Title  2  of  the  United  States  Code  specifies  the  procedure  to 
be  used  in  apportionment.  The  Census  Bureau  is  responsible 
for  the  preparation  of  a  report  including  the  population  of 
each  State  and 

the  number  of  Representatives  to  which  each  State  would 
be  entitled  under  an  apportionment  of  the  then  existing 
number  of  Representatives  by  the  methods  known  as  the 
method  of  equal  proportions,  no  State  to  receive  less  than 
one  member.1 
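
The statute names the method of equal proportions without spelling out the arithmetic. A minimal sketch of the standard procedure follows: every State starts with one seat, and each remaining seat goes to the State with the highest priority value, population divided by the square root of n(n+1), where n is the number of seats the State already holds. The three State populations and the house size of 6 are hypothetical.

```python
import heapq
from math import sqrt


def equal_proportions(populations, house_size):
    """Apportion seats by the method of equal proportions.

    Every State receives one seat first; the rest are assigned one at a
    time to the State with the largest priority value P / sqrt(n * (n + 1)).
    """
    seats = {state: 1 for state in populations}
    # heapq is a min-heap, so priorities are stored negated.
    heap = [(-pop / sqrt(1 * 2), state) for state, pop in populations.items()]
    heapq.heapify(heap)
    for _ in range(house_size - len(populations)):
        _, state = heapq.heappop(heap)
        seats[state] += 1
        n = seats[state]
        heapq.heappush(heap, (-populations[state] / sqrt(n * (n + 1)), state))
    return seats


# Hypothetical populations for three States sharing six seats.
populations = {"A": 1_000_000, "B": 2_000_000, "C": 3_000_000}
result = equal_proportions(populations, house_size=6)
print(result)
```

Because seats are awarded by ranking priority values across all States, whether a given population change moves a seat depends on every State's population, not only on the State in question.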

The Bureau's provisional estimates of population and
congressional districts were presented in a news release titled,
"New Census Population Estimates Indicate Extensive
Congressional Redistricting After 1980" [13]. Despite the
warnings that "redistricting" (i.e., the actual drawing of
new district boundaries) cannot be forecast with certainty,
the press and the political world found the news irresistible.
As the Bureau pointed out, the actual redistricting is very
sensitive to the detail and accuracy of the population base,
and some States will have to draw new district lines to ensure
that all districts meet court-set standards of population
equality, but the projected reapportionment of seats among
14 States could not be mitigated.

The Bureau's estimates showed that eight States would
probably gain seats in the U.S. House and that six would lose
as a result of the reapportionment following the 1980 census.
The reaction to such an announcement was predictable.
Those representing States predicted to gain seats were elated;
those representing the losers were understandably agitated.
The implications are more than just personal problems or
opportunities for current members of Congress.

In  1964,  in  a  packet  of  nine  cases  headed  by  Reynolds  v. 
Sims,  the  "one-man-one-vote"  rule  as  it  applied  to  legislative 
bodies  was  announced.  By  the  1966  election,  virtually  every 
congressional  and  legislative  seat  had  been  affected  by  the 
judicial orders for new districting plans [2]. Coincident with
the Goldwater presidential defeat of 1964 had been the loss
of 541 Republican-held seats in State legislatures [2]. After
the 1964 elections, Republicans controlled the legislatures
in only six States and one house in nine [2].

In 1979, the Democrats are still in control of more than 30
State legislatures. The legislative elections of 1980 will be
the last chance for either party to gain control before the
redrawing of congressional district boundaries following the
1980 census. Changes in the number of congressional seats can
signal changes which could impact on control of a State's
legislature. Nine of the Nation's ten most populous States
will probably gain or lose at least one congressional seat.

The  Undercount  and  Reapportionment 

Normal  political  concerns  with  reapportionment  and 
redistricting  would  be  aggravated  by  a  census  undercount, 
its  measurement,  and  any  possible  adjustments.  Since  the 
publication  of  America's  Uncounted  People,  concern  regard- 
ing an  undercount  in  1980  has  focused  on  measuring  and 
adjusting  for  such  an  undercount.  Most  of  the  discussion  of 
the  impact  of  an  undercount  and  potential  adjustments  has 
dealt  with  the  apportionment  of  public  funds.  Relatively 
little  has  been  said  about  the  impact  of  an  undercount  on 
the  reapportionment/redistricting  process.  The  work  of  the 
Bureau  and  the  testimony  delivered  before  congressional 
committees  have  documented  in  some  detail  the  shifts  in 
the  distribution  in  funds  that  would  result  from  adjusting 
for the 1970 census undercount. The Bureau estimates that
at most two congressional seats and four States could have
been affected by adjustments for the 1970 undercount [11].
The Bureau's analysis of the impact of the 1970 undercount
regards the possible shifts in State legislative districts
and city council districts as small [9].

The  Bureau's  analysis  of  the  impact  of  an  undercount  and 
adjustment  on  representation  has  been  limited.  The  question 
of  legal  requirements  to  adjust  for  an  undercount  in  1980  has 
not  been  addressed.  The  Bureau  has  focused  on  adjustments 
to  the  census  for  use  in  the  many  Federal  programs  that 
involve  disbursement  of  funds  to  States  on  the  basis  of  popu- 
lation because  this  has  been  the  focus  of  recent  legislation.2 
If  the  census  is  adjusted,  how  could  a  more  accurate  set  of 
counts  be  ignored  when  congressional  seats  are  apportioned? 
If  an  undercount  is  certain  in  1980,  as  the  Bureau's  studies 
seem  to  indicate,  must  some  adjustments  be  made  if  the 
courts'  population  equality  standards  are  to  be  enforced? 
These  are  issues  that  have  not  been  directly  addressed  by  the 
courts  or  by  reapportionment  scholars.  The  impact  of  such 
issues  is  small,  given  the  Bureau's  focus  on  nationwide 
impact;  but  where  such  issues  are  important,  most  likely 
in  the  more  populous  States  and  in  urban  areas,  there  will 
be  court  challenges  and  there  will  be  concerned  State  and 
congressional  representatives. 

ANALYSIS  OF  ISSUES  IN 
REAPPORTIONMENT/REDISTRICTING 

A census is defined as a count of population. It is generally
assumed by the public that such a count is accurate, and
that no one is omitted or counted twice. This assumption is
not realistic. In every census there is probably some error in
the enumeration. The Bureau's analysis of the 1970 census
count has provided an estimated undercount. Due to the size
of that undercount and the characteristics of its distribution,
the effect of that undercount fell disproportionately on
those in heavily urbanized areas, those in minority ethnic
groups, those in poverty areas, and those who represent the
aforementioned. The Bureau's analysis and description of
which groups were undercounted was clear enough for
everyone to understand and use, and forceful enough to
attract the attention of the press and the political community.
With the 1980 count coming, people seem to be more
aware of the possibility of an undercount.

1 This number was specified as 435 in the Apportionment Act of
Aug. 8, 1911 (37 Stat. 13).

2 S. 1606, introduced by Mr. Moynihan July 31, 1979, directs the
Secretary of Commerce, "In conducting the census . . . (to) adjust the
population figures . . . to correct for undercounting."

The  impact  of  an  undercount  is  something  that  is  not  as 
well  understood.  As  was  mentioned,  the  concentration  has 
been  on  the  impact  on  the  distribution  of  public  funds.  This 
is  understandable,  given  the  facts.  Funding  is  by  its  nature 
quantifiable  and  easily  subject  to  statistical  analysis  and 
adjustment  schemes.  The  distribution  of  funds  goes  on  con- 
tinuously, as  does  population  change,  and  funding  changes 
can  be  made  at  various  levels  and  in  small  units.  Apportion- 
ment is  discontinuous;  it  takes  place  at  only  a  few  levels, 
is  handled  in  rather  large  indivisible  blocks  (seats),  and  has 
been  subject  to  change  only  by  court  order. 

Apportionment  of  Seats 

Is  the  impact  of  an  undercount,  or  an  adjustment,  on 
reapportionment  significant?  In  1975,  Jacob  Siegel  of  the 
Population  Division  of  the  Bureau  of  the  Census  analyzed 
the  effect  of  an  undercount  on  representation  in  legislative 
bodies  at  various  levels  of  government  [9] .  Mr.  Siegel 
constructed  a  table  titled,  "Minimal  Model  Values  for  the 
Number  of  Contiguous  Legislative  Districts  and  the  Total 
Contiguous  Population  Required  to  Establish  an  Additional 
District,  According  to  the  Underenumeration  Rate  by  Race, 
the  Population  Distribution  by  Race,  and  the  Average 
Population  Per  District,"  which  allows  one  to  easily  estimate 
whether  or  not  a  geographic  area  would  qualify  for  an  addi- 
tional congressional  or  legislative  seat  as  a  result  of  a 
correction  of  the  census  undercount  [9] . 

On  a  nationwide  level  there  is  probably  little  impact  of  an 
undercount  on  representation,  according  to  Siegel's  table.  If 
the  general  methodology  used  to  produce  this  table  is  applied 
to  a  specific  area,  the  results  may  be  significant.  The  signifi- 
cance of  such  results  will,  of  course,  be  increased  if  the 
individual  interpreting  those  results  has  a  special  interest  in 
that  area  and  makes  some  assumptions  regarding  under- 
enumeration rates  in  that  area. 

Using New York City as an example, Siegel's table would
indicate that an adjustment for the undercount in 1970
would not produce additional representation. Taking New
York City's 1970 population of 7.894 million, with a black
population of approximately 22 percent, does not produce
another congressional seat to add to the city's existing 18.
According to the table, with a 25-percent black population,
a combination of 28 contiguous districts and a population
of 13.141 million would be required to produce an additional
district, if average underenumeration rates are applied
(1.9 percent for whites and 7.7 percent for blacks). If high
underenumeration rates are applied (2.2 percent for whites
and 9.2 percent for blacks), the city would need 24 contiguous
districts containing a population of 11.020 million to
yield another congressional district.

Applying  a  simple  assumption  and  some  additional  infor- 
mation to  this  table  produces  a  different  result.  New  York 
City  has  large  populations  of  minorities  other  than  blacks. 
The  Hispanic  population  is  the  largest  minority  population 
next  to  the  black  population.  The  representatives  of  the 
Hispanic  community  have  speculated  that  the  under- 
enumeration rate  of  Hispanics  is  equal  to  or  greater  than  that 
estimated  for  blacks.  The  Bureau  has  estimated  that  "the 
coverage  level  of  the  Hispanic  population  in  1970  falls 
between  that  of  the  white  and  black  populations"  [12] . 

Assuming  that  all  groups  other  than  whites  were  under- 
enumerated  at  a  rate  equal  to  the  underenumeration  of 
blacks,  an  additional  congressional  seat  is  produced  for  New 
York  City,  a  result  consistent  with  the  city's  own  estimates 
of  the  1970  undercount.  With  a  population  32  percent  other 
than  white,  the  high  underenumeration  rate  portion  of 
Siegel's  table  would  cause  one  to  speculate  that  an  additional 
congressional  seat  for  New  York  City  would  be  possible. 
If  a  calculation  is  made  on  a  7.894  million  population  with 
a  32-  and  68-percent  nonwhite/white  split,  at  the  high 
underenumeration  rates  the  result  is  an  undercount  of 
approximately  351,000.  Allowing  for  variations  in  the  size 
of  districts  and  the  priorities  produced  under  the  method 
of  "equal  proportions"  used  to  determine  the  number  of 
Congressmen  apportioned  to  each  State,  351,000  would  very 
likely  produce  another  congressional  seat.3  Given  the  popu- 
lation ratios  of  New  York  State's  Senate  and  Assembly 
seats  (304,023  and  121,609,  respectively),  at  least  three 
additional  legislative  seats  could  be  allocated  to  New  York 
City. 
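
The arithmetic behind the 351,000 figure can be reproduced as a short sketch. It assumes the high underenumeration rates are applied directly to the enumerated population, which is the reading that comes closest to the figure quoted above (the result is about 350,500). The variable names are illustrative only.

```python
# Figures quoted in the text for New York City, 1970.
population = 7_894_000
share_nonwhite = 0.32
share_white = 0.68
rate_nonwhite = 0.092  # high underenumeration rate, applied to all nonwhites
rate_white = 0.022     # high underenumeration rate for whites

# Assumption: rates are applied directly to the enumerated population.
undercount = (population * share_nonwhite * rate_nonwhite
              + population * share_white * rate_white)

# Population per seat in the New York State Senate and Assembly.
senate_ratio = 304_023
assembly_ratio = 121_609

senate_equivalent = undercount / senate_ratio
assembly_equivalent = undercount / assembly_ratio

print(round(undercount), round(senate_equivalent, 2), round(assembly_equivalent, 2))
```

On these assumptions the estimated undercount is worth slightly more than one Senate district and nearly three Assembly districts, in line with the statement that at least three additional legislative seats could be allocated to the city.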

These  calculations  are  presented  as  the  type  of  analysis 
the  States  and  municipalities  will  develop  in  responding  to 
predictions  of  an  undercount.  The  Bureau's  estimate  that 
only  two  congressional  seats  would  be  shifted  (affecting  four 
States;  New  York  was  not  one)  and  that  the  impact  on  State 
legislative  representation  would  be  small  is  not  what  the 
individual States and municipalities will project. In the case
of State legislatures, the statewide population count as well
as small area counts can affect both the number of seats in
a house and the drawing of district boundaries. In New York
State, the size of the Senate is not fixed and can vary with
population changes.4 The "block on border" rule in New
York requires that the population of each individual city
block be checked to produce the most precise district
population equality, which greatly affects the drawing of
State legislative districts.

3 "Under the method of 'equal proportions,' the method used to
determine the number of Congressmen from each State in the U.S.
House of Representatives, the shift in the population of a State
required to produce a change in the State's representation may be
merely a few hundred persons or a few hundred thousand persons,
depending on the precise populations of all States." (See ref. 8, p.
106.)

Technical  Problems 

The  Bureau's  analysis  of  the  undercount  and  its  impact 
on  representation  surfaces  a  number  of  technical  issues  that 
will  create  potential  problems.  While  those  who  develop 
legislation  can  simply  stipulate  that  the  Bureau  employ  the 
best  available  methodology  to  correct  for  undercounting, 
the  Bureau's  task  is  not  so  easy  and  the  result  is  unlikely  to 
be  universally  accepted  as  the  "best."  The  Bureau's  problem 
is  well  summarized  in  the  introduction  to  one  of  its  many 
technical  reports. 

Establishing  the  exact  or  even  the  approximate  extent 
of  underenumeration  is  much  more  difficult  than  dis- 
covering that  such  a  problem  exists  [10]. 

Subnational  estimates  of  undercount  and  resulting  adjust- 
ments are  likely  to  be  the  Bureau's  largest  problem.  While  the 
apportionment  of  Congress  could  be  carried  out  with  only 
national  and  State  population  counts,  the  drawing  of  district 
boundaries  requires  counts  for  small  geographic  areas.  A 
1-percent  standard  for  congressional  districts  of  500,000 
makes  a  small  area  of  5,000  persons  significant.  These  figures 
must,  of  course,  be  precise  (ranges  will  not  do)  and  agree 
with  State  totals.  State  representatives  will  assume  and 
expect  the  same  accuracy,  reliability,  and  confidence  the 
Bureau indicates for its national and State totals. While the
Bureau appears well aware of the public's mistaken presumption
that such counts and adjustments will easily follow from
the methodology for national estimates of coverage, the Bureau
should be aware that State laws may require such for
redistricting purposes. An example of such need is New York
State's  rejection  of  the  use  of  estimates  for  small  areas  based 
on  sample  data.  In  New  York  State  in  1949,  the  Legislature's 
Reapportionment  Committee  contacted  the  Bureau  regarding 
a  method  of  taking  a  census  of  citizen  population  of  the 
State.  The  Bureau  offered  to  determine  citizen  population 
based  on  a  20-percent  sample,  but  the  proposal  was  rejected 
and  an  actual  census  of  citizens,  with  city  and  town  block 
counts,  was  contracted  for  and  delivered  in  1951  [3] . 

The lack of uniformity in underenumeration rates will
prohibit the blanket use of such rates in making adjustments
to small geographic areas. As Siegel correctly pointed out in
1974:

Under an apportionment formula, if the apportionment is
based entirely or primarily on population and if the rate
of underenumeration is the same from area to area, the
results of such apportionment would be essentially
unaffected by any undercoverage [12].

4 New York State Constitution, Article III, section 4.

Mr.  Siegel  points  out  that  rates  are  not  uniform  and  are 
higher  for  minority  populations  and  probably  for  urban 
areas.  There  appears  to  be  a  general  awareness  of  this  lack  of 
uniformity  and  the  need  for  adjustments  to  be  tailored  to 
local  characteristics.  Without  this  tailoring,  adjustments 
would  be  useless  from  the  point  of  view  of  legislators  and  the 
courts  for  use  in  reapportionment. 
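
Siegel's point about uniform rates can be checked against the priority values of the equal-proportions method. The notation below is introduced only for this illustration: P_i is the enumerated population of area i, u_i its underenumeration rate (taken here as the share of the true population missed), and n the number of seats the area already holds.

```latex
% Priority value of area i when it holds n seats:
a_i(n) = \frac{P_i}{\sqrt{n(n+1)}}

% Adjusted population under rate u_i:
P_i' = \frac{P_i}{1 - u_i} = c_i P_i , \qquad c_i = \frac{1}{1 - u_i}

% Uniform rate, u_i = u for every area, so c_i = c:
a_i'(n) = c \, a_i(n)
\quad\Longrightarrow\quad
a_i'(n) > a_j'(m) \iff a_i(n) > a_j(m)

% Nonuniform rates: the ranking can change whenever
c_i \, a_i(n) > c_j \, a_j(m) \quad \text{while} \quad a_i(n) < a_j(m)
```

With a uniform rate every priority value is multiplied by the same constant, so the ranking and hence the apportionment are unchanged. Only differences in rates between areas can move a seat.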

The  stability  of  estimates  of  underenumeration  and 
adjustments  is  likely  to  present  problems.  In  the  Bureau's 
efforts  to  develop  an  expected  "true"  population,  estimates 
of  coverage  have  been  issued  a  number  of  times.  Discussions 
of  various  techniques  and  updates  of  estimates  in  1974  dealt 
with  changes  of  up  to  ±0.1  percent  in  undercount  rates. 
While  this  is  a  relatively  small  amount,  especially  when 
dealing  with  a  population  of  over  200  million,  it  still  repre- 
sents 0.2  million  people.  In  1977,  the  Bureau  offered  alterna- 
tive estimates  of  net  underenumeration  at  the  regional  and 
State  levels  [11].  These  estimates  were  sufficiently  different 
to  produce  different  changes  in  representation  in  different 
States  when  the  data  were  applied  to  the  equal-proportions 
methodology.  One  series  of  estimates  produced  a  change  of 
one  seat  between  Tennessee  and  Oklahoma  and  another 
produced  changes  of  two  seats  involving  California,  Texas, 
Ohio,  and  Oklahoma.  While  this  change  may  be  considered 
small,  at  what  point  in  time  does  the  country  accept 
estimates  as  final  for  purposes  of  apportionment? 

Even  if  an  adjustment  of  the  1980  census  is  agreed  upon 
for  reapportionment  of  Congress  at  the  national  level,  and  it 
stands  in  the  courts,  the  ability  of  the  Bureau  to  produce 
counts  in  sufficient  detail  to  meet  the  technical  requirements 
for  drawing  districts  is  in  doubt.  The  question  of  whether 
data  could  be  produced  to  meet  the  legally  required  technical 
aspects  of  State  legislative  redistricting  can  only  be  answered 
by  surveying  the  individual  States.  My  reading  of  summaries 
of  constitutional  requirements  of  various  States  shows  no 
State  requiring  block  level  data,  but  many  States  have  strict 
equality and timing requirements. Reviewing fewer than a half
dozen constitutions directly showed that only New York
required census counts on the block level, which would be
an impossibility given the present state of the art in
producing adjustments.

Court  Cases 

In  analyzing  questions  of  reapportionment  and  redis- 
tricting, the  discussion  has  depended  on  court  cases  and 
population  statistics.  Since  there  is  no  body  of  law  dealing 
specifically  with  the  need  for  or  use  of  estimates  or  adjust- 
ments by  the  Bureau  in  congressional  apportionment,  I  have 
looked for related cases that might set a precedent or be
indicative of the courts' leanings. Cases were selected which
dealt  with  the  questions: 

Is  it  legally  necessary  to  have  an  adjustment  if  an  under- 
count  is  certain? 

Would the courts require new apportionment or redistricting
plans if an adjustment was available?

Three  cases  were  found  that  dealt  indirectly  with  these 
questions.  While  I  feel  there  may  be  other  cases  that  I  have 
not  located  that  also  deal  indirectly  with  these  questions,  the 
three  cases  presented  probably  offer  as  much  insight  into  the 
possible  actions  of  the  courts  in  the  eighties  as  will  any 
others. 

The  three  cases  that  follow  all  tend  to  discount  the  value 
of  census  adjustments  and  refinements  of  population  counts 
in  the  reapportionment  process.  In  the  case  of  Asbury  Park 
Press,  Inc.  v.  Woolley,  33  N.J.  1  (1960),  the  New  Jersey 
Supreme  Court  heard  a  taxpayer's  action  seeking  declaratory 
judgment  that  the  1941  New  Jersey  apportionment  law  was 
unconstitutional  in  light  of  1950  census  figures.  The  court 
ruled  that  since  the  New  Jersey  1941  apportionment  had  not 
been  challenged  in  8  years  and  since  the  1960  census  figures 
would  be  available  before  the  1961  election,  the  court  would 
defer  declaration  to  allow  the  legislature  to  reapportion 
itself.  In  addition,  the  court  stated  that  the  results  of  pre- 
liminary counts  customarily  released  by  the  Census  Bureau 
would  be  sufficiently  accurate  for  the  New  Jersey  General 
Assembly  to  proceed  to  redistrict  in  an  intelligent  manner, 
provided  the  counts  included  data  broken  down  by  counties, 
towns,  and  wards.  The  New  Jersey  court  cited  the  earlier 
Connecticut  case  Cahill  v.  Leopold,  141  Conn.  1  (1954),  in 
which  the  Connecticut  Supreme  Court  ruled  that: 

It  is  not  necessary  that  the  information  be  published  in 
book  form  before  it  becomes  officially  available.  Indeed, 
there  is  not  even  a  constitutional  provision  requiring  the 
figures  to  be  final.  While  final  tabulations  tend  to  greater 
exactitude  than  those  previously  computed,  there  is  no 
need  for  the  precision  of  perfection.  The  results  of  the 
preliminary  counts  customarily  released  by  the  Census 
Bureau,  as  happened  in  the  case  at  bar,  are  ample  to 
afford  sufficiently  accurate  data  for  an  Assembly  to  pro- 
ceed to  redistrict  in  an  intelligent  manner,  provided  the 
counts  have  been  broken  down  into  counties,  towns,  and 
wards.  (103  A.  2d,  at  pages  823-824) 

In  the  case  of  Koziol  v.  Burkhardt,  51  N.J.  412  (1968), 
the  court  ruled  that  the  legislature  was  not  required  to  use  a 
1967  population  estimate  to  make  a  1968  amendment  to  a 
1966  redistricting  act  based  on  1960  census  data.  The  court 
ruled  the  legislature  was  free  to  use  the  1960  census  on 
which  the  act  was  based,  but  did  not  consider  whether  the 
legislature  could  use  such  interim  estimates. 

In the session of November 8, 1976, the U.S. Supreme
Court affirmed the judgment of the U.S. District Court in
the  case  of  Republican  Party  of  Shelby  County,  Tenn.  v. 
Dixon,  USDC  W.  Tenn,  3/25/76.  In  a  summary  action  the 
Supreme  Court  ruled: 

In  determining  validity  of  congressional  districting  Federal 
district  court  is  not  confined  as  matter  of  law  to  1970 
Federal  census  figures,  but  may  consider  reliable  popula- 
tion estimates  made  since  then;  Federal  decennial  census 
figures  will,  however,  be  controlling,  unless  there  is  "clear, 
cogent,  and  convincing  evidence"  that  they  are  no  longer 
valid  and  that  other  figures  are  valid;  neither  post-1970 
Census  population  figures  prepared  by  National  Planning 
Data  Corporation  nor  "provisional  estimates"  of  Census 
Bureau  meet  such  test;  reapportionment  of  sixth,  seventh, 
and  eighth  congressional  districts  of  Tennessee  is  ordered 
on  basis  of  1970  Federal  census  figures  [7] . 

If  I  can  be  allowed  to  interpret  these  cases  and  produce 
some  general  conclusions,  they  would  be  as  follows: 

1.  An  otherwise  valid  district  plan  need  not  be  invalidated 
because  more  recent  population  data  are  available. 

Asbury  Park  Press,  Inc.  v.  Woolley, 
33  N.J.  1  (1960) 

This is contingent on the assumption that a new
districting plan will be generated as a matter of course
when the next decennial census becomes available.

2.  Decennial  census  counts  for  use  in  reapportionment 
need  not  be  those  eventually  published  as  the  final 
count. 

Asbury  Park  Press,  Inc.  v.  Woolley, 
supra. 

State  constitutions  don't  generally  specify  "final" 
census  counts,  accepting  instead  that  when  the  census 
becomes  "officially"  available  for  public  use,  it  is  ac- 
curate enough. 

3.  Population  counts  must  be  broken  down  in  municipal 
units  small  enough  to  meet  requirements  for  drawing 
of  district  boundaries. 

Asbury  Park  Press,  Inc.  v.  Woolley, 
supra. 

4.  Adjustments  to  redistricting  plans  based  on  census  data 
can  be  made  based  on  that  census  data  and  need  not 
be  based  on  more  recent  estimated  population  data. 

Koziol  v.  Burkhardt,  51  N.J.  412  (1968) 


5.  In  determining  the  validity  of  a  districting  plan,  the 
court  is  not  confined  to  decennial  census  figures. 

Republican  Party  of  Shelby  County, 
Tenn.  v.  Dixon,  USDC  W.  Tenn., 
3/25/76 

6.  Decennial  census  figures  are  to  be  used  unless  clearly 
no  longer  valid,  and  better  figures  are  available. 

Republican  Party  of  Shelby  County, 
Tenn.  v.  Dixon,  supra. 

7.  Projections  from  a  decennial  base  and  the  Census 
Bureau's  "provisional  estimates"  of  population  are 
not  clearly  better  than  decennial  census  figures. 

Republican  Party  of  Shelby  County, 
Tenn.  v.  Dixon,  supra. 

To  summarize  these  conclusions,  I  would  say  that  the 
courts  would  not  require  adjustment,  or  redrawing,  of  a 
redistricting  plan  that  had  been  validated  by  the  court  or 
had  not  been  challenged  because  of  changes  in  population 
as  evidenced  by  the  types  of  estimates,  updates,  or  adjust- 
ments to  the  census  which  the  Bureau  currently  issues. 

If  an  undercount  is  certain,  but  it  is  agreed  that  it  cannot 
adequately  be  measured  or  adjusted,  the  courts  would  have 
no  choice  but  to  accept  the  census  as  taken.  If  there  is  a 
general  agreement  on  the  measurement  of  the  undercount, 
and  an  adjustment  for  use  in  distribution  of  funds  is  pro- 
duced and  accepted  by  the  Federal  and  State  governments, 
the court action could be to accept such figures as valid and
either require that they be used in reapportionment or find that
they evidence violation of the one-man-one-vote principle and
invalidate existing plans.

This  does  not  answer  the  questions  posed  in  the  search  for 
cases.  The  only  thing  that  is  certain  is  that  the  courts  have 
kept  their  options  open. 

CONCLUSIONS,  SPECULATION,  AND  THE 
MID-DECADE  CENSUS 

The  Census  Bureau  is  currently  considering  whether  or 
not  it  should  adjust  the  1980  census  count  by  allocation  of 
the  estimated  uncounted  population.  The  Bureau  is  also 
considering  the  methodology  to  be  used  and  the  extent  of 
the  adjustment.  Before  any  conclusions  regarding  the  impact 
of  an  adjustment  on  reapportionment  and  redistricting  can 
be  reached,  the  Bureau's  decision  must  be  considered. 

Despite  its  reliance  on  population  statistics,  the 
reapportionment/redistricting  process  is  essentially  legal  in 
nature.  Whether  this  is  due  to  the  environment  in  which  the 
process  takes  place  or  the  people  who  usually  carry  it  out,  it 
is treated as a legal question of political representation. The
efforts of government reform groups to change the
reapportionment/redistricting  process  are  always  through 
the  courts,  and  their  proposals  involve  changing  who  manages 
the  process  and  how.  No  one  has  proposed  that  the  process 
be  treated  as  a  completely  technical  one  and  be  assigned  to 
a  computer. 

The  Bureau  is  responding  to  the  concerns  it  now  hears 
expressed.  After  an  adjustment  is  made,  the  issue  will  not  be 
the  need  for  an  adjustment  but  the  result  and  its  impact.  If 
the  Bureau  decides  to  adjust  the  census,  it  will  bear  the 
burden  of  justifying  the  action  and  the  methodology.  If  the 
Bureau  decides  against  an  adjustment,  it  will  be  criticized, 
and  possibly  the  Congress  will  enact  legislation  requiring  an 
adjustment. 

In  the  case  of  the  mid-decade  census,  the  Congress  did 
act.  The  discussion  of  a  mid-decade  census  predates  the 
establishment  of  the  Census  Bureau  [1].  The  Bureau  is 
required  to  take  a  decennial  census  and  is  permitted  to 
produce  current  population  on  an  annual  basis.  The  Congress 
ended  discussion  in  1976  by  requiring  a  mid-decade  census. 
Accordingly,  the  Congress  produced  legislation  that  details 
the  intended  use  of  the  results  of  such  a  census. 

While  the  requirement  of  a  mid-decade  census  may  mean 
more  work  for  the  Bureau,  the  Congress  has  shouldered 
some  of  the  responsibility  for  its  conduct  and  its  results.  The 
legislation  is  interpreted  by  the  Bureau  as  "flexible,"  allow- 
ing the  Bureau  to  determine  methodology  [1].  This 
flexibility  also  allows  the  Congress  to  determine  the  scope  of 
the  mid-decade  census  both  by  direct  review  and  by  provi- 
sion of  funding.  The  scale  of  the  Bureau's  programs  (the 
current  proposal  is  for  a  limited  sample  survey)  will  be  set 
with  congressional  review  and  approval. 

The  Congress  has  also  specified  that: 

(IF)  in  the  administration  of  any  program  established  by 
or  under  Federal  Law  which  provides  benefits  to  State  or 
local  governments  or  to  other  recipients,  eligibility  for  or 
the  amount  of  such  benefits  would  be  determined  by 
taking  into  account  data  obtained  in  the  most  recent 
decennial  census,  and  (IF)  comparable  data  are  obtained 
in  a  mid-decade  census  conducted  after  each  decennial 
census,  then  in  the  determination  of  such  eligibility  or 
amount  of  benefits  the  most  recent  data  available  from 
either  the  mid-decade  or  decennial  census  shall  be 
used  [6] . 


But, 


Information  obtained  in  any  mid-decade  census  shall  not 
be  used  for  apportionment  of  Representatives  of  Congress 
among  the  several  States,  nor  shall  such  information  be 
used  in  prescribing  congressional  districts  [6]. 

The  result  of  these  congressional  stipulations  is  that  the 
Bureau may face some questioning of its 1985 population
counts by those whose general revenue-sharing funding drops,
but  the  court  challenges  regarding  congressional  district 
reapportionment  will  be  directed  at  Congress.  In  addition, 
while this clause may protect Congress from a 1985 reapportionment
(and since reapportionment is a fate worse than
death,  I'm  sure  that  Congress  feels  it  will),  it  does  not  pro- 
hibit the  use  of  the  mid-decade  census  in  challenging  State 
legislative  district  plans.  Given  the  rate  at  which  our  Nation's 
population  is  changing,  there  will  certainly  be  challenges  to 
a  State  representation  plan  unless  this  prohibition  is 
extended,  but  the  lobbying  and  complaining  will  be  directed 
at the Congress, not the Bureau. If this prohibition is expanded,
or if State constitutions specify decennial redistricting,
or if the Bureau does not produce data for small
geographic  areas,  there  should  be  no  questioning  of  the 
Bureau's  1985  product.  If  States  are  subject  to  redistricting 
as a result of the 1985 count and, even worse, if the courts
invalidate this prohibition, then the Bureau will find itself
the  focus  of  questions  regarding  the  methodology  and 
validity  of  the  1985  counts. 

If  the  Bureau  decides  it  must  adjust  the  1980  count,  or 
the  Congress  mandates  an  adjustment  but  provides  no  details, 
the  mechanism  and  methodology  of  the  adjustment  will 
determine  whether  or  not  the  adjustment  will  impact  the 
reapportionment/redistricting  process.  If  it  is  possible  to 
estimate  the  undercoverage,  adjust  the  count,  and  incorpo- 
rate the  results  into  the  census  as  it  is  officially  released 
and  used  for  apportionment  of  Congress  and  Federal  aid, 
there  is  a  good  chance  that  the  adjustment  would  be  gener- 
ally accepted  and  upheld  by  the  courts.  Of  course,  the 
adjustment  would  have  to  apply  to  sub-State  counts  as  well 
as  the  national  and  State  totals. 

As  the  result  of  an  adjustment  creates  a  greater  gap 
between  the  figures  available  and  the  concept  of  a  single 
census  (issued  as  the  one  set  of  true  numbers),  the  amount 
of  challenge  will  grow.  In  the  allocation  of  funds  many 
factors  are  involved,  the  numbers  are  large,  comparisons 
are  vague,  and  Federal  funding  is  not  easily  characterized 
to  the  public  as  a  zero-sum  situation.  In  reapportionment 
those involved know that the number of seats is limited and
small, and that for someone to gain, someone must lose. If the
difference  between  the  adjusted  census  and  the  true  census 
is  reasonable  (which  means  that  it  appears  valid,  has  wide 
academic  support,  the  methodology  is  not  under  constant 
attack,  and  its  presentation  is  trouble-free,  and  this  applies 
at  all  levels  of  geography),  the  adjusted  count  should  be 
accepted  by  the  public  and  considered  valid  and  better  by 
the  courts  than  the  unadjusted  count.  When  the  gap  widens 
to  the  point  where  the  court  considers  the  adjusted  count 
invalid  or  no  better  than  the  unadjusted  count,  the  potential 
for  problems  disappears  as  the  unadjusted  count  again  be- 
comes controlling  for  the  purposes  of  reapportionment  and 
redistricting. 

The greatest challenge to the use of an adjusted count for
reapportionment purposes would result if the courts accepted
an adjusted count as valid, but the public considered
the  adjustment  unreasonable.  Such  a  situation  seems  unlikely 
given  that  the  adjustment,  as  we  are  considering  it,  would  not 
be  congressionally  mandated  nor  required  for  apportionment 
of  Federal  aid,  making  it  unlikely  that  the  court  would 
accept  an  adjustment  as  valid  unless  it  was  generally  accepted 
by  the  public. 

Given  the  consideration  and  preparation  the  Census 
Bureau  has  put  into  its  decisionmaking  process,  it  seems 
unlikely  that  an  adjusted  count  would  be  accepted  as  valid 
by  the  courts  for  reapportionment  and  redistricting  purposes 
and  at  the  same  time  generate  more  problems  and  complaints 
than  a  census  which  is  known  to  be  an  undercount.  An 
adjustment will change who complains. The States and
municipalities affected by an adjustment will differ from those
that suffer from an undercount. The magnitude of the
problem  should  be  predictable  if  the  Bureau  has  confidence 
in  its  1980  estimates  and  the  methodology  it  plans  to  use. 

On the congressional level, the number of seats at stake
is fixed. While there may be a change in which States lose
and  which  gain,  I  would  speculate  that  the  currently  pre- 
dicted losers  will  simply  lose  a  little  less  and  gainers  gain  a 
little  less.  State  legislatures  will  also  be  in  a  trade-off 
situation,  with  potential  gains  and  losses  depressed. 
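
The zero-sum character of congressional apportionment can be illustrated with a short sketch. The paper does not describe the apportionment formula; the code below assumes the method of equal proportions, which has been used for the U.S. House since the 1940s. The State names, populations, and undercount rates are hypothetical and chosen only for illustration.

```python
import heapq
from math import sqrt


def apportion(populations, seats):
    """Assign `seats` among States by the method of equal proportions.

    Every State first receives one seat; each remaining seat goes to the
    State with the highest priority value population / sqrt(n * (n + 1)),
    where n is the number of seats that State already holds.
    """
    if seats < len(populations):
        raise ValueError("need at least one seat per State")
    allocation = {state: 1 for state in populations}
    # heapq is a min-heap, so priorities are stored negated.
    heap = [(-pop / sqrt(2), state) for state, pop in populations.items()]
    heapq.heapify(heap)
    for _ in range(seats - len(populations)):
        _, state = heapq.heappop(heap)
        allocation[state] += 1
        n = allocation[state]
        heapq.heappush(heap, (-populations[state] / sqrt(n * (n + 1)), state))
    return allocation


def seat_changes(counted, adjusted, seats):
    """Seats gained (+) or lost (-) by each State when adjusted counts are used."""
    before = apportion(counted, seats)
    after = apportion(adjusted, seats)
    return {state: after[state] - before[state] for state in counted}


# Hypothetical census counts and hypothetical undercount rates.
counted = {"North": 5_000_000, "South": 3_000_000, "East": 2_000_000}
undercount_rate = {"North": 0.01, "South": 0.04, "East": 0.02}
adjusted = {s: counted[s] / (1 - undercount_rate[s]) for s in counted}

changes = seat_changes(counted, adjusted, seats=10)
```

Whatever the adjustment, the values in `changes` sum to zero: any seat one State gains under the adjusted counts is a seat another State loses.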
The points that Mr. Carlucci makes about the impact of
adjustment  on  politicians  are  important.  To  carry  the  issue 
further,  from  the  State  legislative  to  the  local  level,  there  are 
over  500,000  elected  public  officials  that  are  chosen  by 
ballot  in  perhaps  100,000  districts.  The  impact  of  two  sets  of 
figures  on  this  group  would  be  great,  as  would  the  impact  of 
timing  (i.e.,  when,  after  the  census,  adjustments  are  made).  If 
the  census  is  indeed  adjusted,  can  a  more  accurate  count  be 
ignored  when  reapportioning  or  redistricting?  Can  this  issue 
be  resolved  by  passing  a  law? 

Whatever  is  decided  about  adjustment  will  inevitably  be 
challenged.  In  one  of  the  cases  cited,  Shelby  County  vs. 
Dixon,  it  was  held  that  redistricting  below  the  congressional 
level  is  not  to  be  confined  to  the  use  of  census  figures  alone. 
Reformers,  both  inside  and  outside  of  government,  will  use 
this  approach  with  emphasis  on  the  rights  of  groups  including 
minorities.  There  also  have  been  a  number  of  cases  dealing 
with dilution of the vote. Under section 5 of the Voting
Rights Act, for example, plans require Federal preclearance,
and the State and local governments have the burden of
proving  that  any  change  is  nondiscriminatory.  Adjustments 
to  census  counts  certainly  will  affect  the  way  the  Justice 
Department  will  evaluate  plans.  With  reference  to  those  State 
and  local  governments  not  covered  by  the  Voting  Rights  Act, 
questions  are  being  raised  in  the  courts  as  to  whether  at-large 
elections  dilute  the  vote  or  are  discriminatory  against 
minorities  or  low  to  moderate  income  persons.  In  the 
Pasadena,  Calif.,  test  case,  a  possible  solution  might  be  to 
increase  the  number  of  districts,  and  this  will  in  turn  raise  the 
pressures  for  data. 

So much depends on the population base, that is, when,
where,  and  how  the  figures  are  published.  The  legal  con- 
sequences of  adjustment  are  unclear,  but  the  political 
consequences  are  that  250,000  to  300,000  State  and  local 
officials  will  be  very  concerned.  Whatever  figures  are  used 
will  be  subject  to  further  change  at  the  next  census.  It  is 
obvious  that  the  political  uncertainties  of  the  adjustment 
issue  will  touch  every  local  government  unit. 
BACKGROUND

Before  explaining  what  has  been  done  in  Australia  with 
regard  to  underenumeration  in  the  census,  it  is  necessary  to 
give  some  background  on  Australia  and  its  political  system. 

Australia  has  six  States  and  two  Federal  Territories.  The 
Federal  parliamentary  system  is  modeled  very  closely  along 
the  lines  of  the  Westminster  system.  The  main  difference  is 
that  the  upper  house  (the  Senate)  is  democratically  elected, 
with  each  State  having  equal  representation.  Senators  are 
elected  for  a  6-year  term,  with  half  the  Senate  facing 
reelection  every  3  years.  The  Members  of  the  House  of 
Representatives  (the  lower  and  more  important  house)  are 
elected  for  a  3-year  term  on  a  "one-man-one-vote" 
principle.  In  addition  to  the  Federal  parliamentary  system, 
there  is  in  each  State  (generally)  a  similar  system  of  an  upper 
and  lower  house  of  Parliament. 

There  are  only  11  cities  of  100,000  people  or  more.  These 
account  for  70  percent  of  the  total  population  of  Australia 
but  less  than  1  percent  of  the  land  mass.  Western  Australia, 
while  containing  only  1.2  million  people,  is  roughly  30 
percent  of  the  land  area  of  the  United  States  (including 
Alaska  and  Hawaii).  There  are  approximately  900  elected 
local  government  authorities,  covering  areas  ranging  in 
population  from  a  few  hundred  to  700,000  people. 

The  Federal  Government  collects  all  personal  income  tax 
revenues,  which  account  for  about  half  of  the  total  Federal 
budget  receipts.  The  remainder  is  made  up  mainly  from 
company  taxes  and  indirect  taxes.  About  35  percent  of  the 
Federal  outlay  is  in  the  form  of  cash  benefits  to  persons 
(mostly welfare payments: age pensions, unemployment
benefits,  etc.),  with  a  similar  amount  being  grants  to  State 
governments.  (This  is  discussed  in  detail  later.) 


These  grants  represent  about  60  percent  of  State  govern- 
ment income.  Grants  from  State  and  Federal  governments  to 
local  government  authorities  (LGA's)  account  for  about  20 
percent  of  local  government  receipts  and  about  2.5  percent 
of  State  government  outlays. 

This  potted  description  should  provide  sufficient  back- 
ground information  to  enable  an  understanding  of  the 
following  paper  and  why  adjustment  for  the  undercount  was 
undertaken  in  Australia. 

LEGISLATIVE  REQUIREMENTS  FOR 
POPULATION  RESULTS 

Population  statistics  are  required  for  a  variety  of  purposes, 
not  the  least  of  which  are  the  legislative  requirements.  In  this 
section,  I  will  discuss  the  need  for  population  statistics  for 
electoral  distributions  and  fund  allocation,  and  the  con- 
sequent legislative  requirement  for  a  census  every  5  years. 

Requirement  for  a  Quinquennial  Census 

The  "Census  and  Statistics  Act  of  1905"  as  amended  in 
1977  requires  that: 

"The  Census  shall  be  taken  in  the  year  1981  and  in  every 
fifth  year  thereafter,  and  at  such  other  times  as  are 
prescribed." 

Prior  to  this  being  included  in  the  act  in  1977,  the 
requirement  was  "every  tenth  year  or  at  such  other  time  as  is 
prescribed."  However,  a  census  has  been  conducted  quin- 
quennially  since  1961. 


Table 1. State Size, Population, and Representation for Australia: September 1978

                                                            Federal Members
                               Area         Population        of House of
State                     (thousand km²)    (thousands)     Representatives    Senators

Total                        7,681.8         14,287.1             125             64

New South Wales                801.6          5,028.3              43             10
Victoria                       227.6          3,825.8              33             10
Queensland                   1,727.2          2,171.7              19             10
South Australia                984.0          1,289.0              11             10
Western Australia            2,525.0          1,227.3              11             10
Tasmania                        67.8            414.4               5             10
Northern Territory           1,346.2            113.3               1              2
Aust. Capital Territory          2.4            217.3               2              2



Three  important  points  arise  out  of  this. 

1.  There  is  a  statutory  requirement  to  conduct  a  census 
every  5  years; 

2.  The  Australian  Bureau  of  Statistics  (ABS)  could  be 
required  to  conduct  a  census  more  frequently  than 
every  5  years;  and 

3.  The  requirement  is  under  the  Census  and  Statistics 
Act,  not  written  into  the  Constitution  or  into  a  more 
general  act. 

Electoral  Requirements  for  Population  Estimates 

The  main  incident  that  led  up  to  the  amendment  of  the  act 
in  1977  was  a  High  Court  decision  in  1976  that  certain 
aspects of the Electoral and Representation Acts were invalid.
These were concerned with the determination of the
number  of  electoral  seats  in  the  Federal  House  of  Repre- 
sentatives, and  the  decision  indicated  that  there  needed  to  be 
an  electoral  redistribution  within  the  life  of  every  Parliament, 
i.e.,  at  least  every  3  years.  The  determinations  of  State 
representation  are  based  on  the  "latest  statistics  of  the 
Commonwealth,"  which  have  as  their  basis  the  most  recent 
population  census  data.  In  view  of  this,  the  Government 
considered  it  necessary  to  require  the  ABS  to  conduct  a 
census  at  least  every  5  years. 

In  Australia,  population  estimates  are  used  to  determine 
the  number  of  Federal  electoral  divisions  (i.e.,  the  number  of 
Members  of  the  House  of  Representatives)  in  each  State.  The 
individual  electorate  boundaries  are,  however,  determined  on 
the  basis  of  the  number  of  registered  voters,  such  that  no 
electorate may differ from the average, in terms of number
of registered voters, by more than ±10 percent. Population
is  not,  therefore,  an  important,  direct  element  in  determining 
the  geography  of  an  individual  electorate,  although  the 
number  of  registered  voters  is. 
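
The ±10 percent rule can be made concrete with a small check. This is only a sketch: the electorate names and enrolment figures are hypothetical, and the function assumes the average is taken over the electorates supplied (for example, those of one State).

```python
def electorates_outside_tolerance(enrolment, tolerance=0.10):
    """Return {electorate: relative deviation} for every electorate whose
    number of registered voters differs from the average by more than
    `tolerance` (0.10 means 10 percent)."""
    average = sum(enrolment.values()) / len(enrolment)
    outside = {}
    for name, voters in enrolment.items():
        deviation = (voters - average) / average
        if abs(deviation) > tolerance:
            outside[name] = deviation
    return outside


# Hypothetical registered-voter counts for four electorates in one State.
enrolment = {"A": 70_000, "B": 75_000, "C": 62_000, "D": 73_000}
flagged = electorates_outside_tolerance(enrolment)
```

With these figures the average is 70,000 registered voters, so only electorate C (about 11 percent below average) falls outside the permitted range. Note that total population plays no part in the check.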

The  primary  reason  the  census  is  required  quinquennially 
is,  however,  to  ensure  that  each  State's  representation 
reflects  up-to-date  population  numbers. 

Fund  Allocation  Requirement  for  Population 
Estimates 

As  mentioned  earlier,  a  considerable  portion  (about  60 
percent)  of  State  government  revenue  is  in  the  form  of 
disbursements  from  the  Federal  Government.  A  major 
portion  (slightly  less  than  half)  of  the  grants  are  in  the 
form  of  "Personal  Income  Tax  Sharing  Entitlements." 
The  Federal  Government  has  determined  that  a  specific 
proportion  of  net  (personal)  income  tax  revenue  be  allocated 
to  the  States  as  general  purpose  grants.  In  1976  and  1977, 
33.6  percent  of  net  personal  income  tax  revenues  were 
allocated,  and  39.8  percent  for  1978  and  1979. 

In  the  allocation  to  the  States,  each  State's  share  is 
determined  with  respect  to  an  "adjusted  population  figure," 


where  this  "adjusted  population"  is  defined  in  the  "States 
Personal  Income  Tax  Sharing  Act  of  1978"  (PITS)  as 


1. In the case of Victoria, the estimated population of
   that State on 31 December in that year;

2. In the case of any other State, the estimated population
   of the State on 31 December in that year multiplied by

   • in New South Wales       1.02740
   • in Queensland            1.39085
   • in South Australia       1.52676
   • in Western Australia     1.66516
   • in Tasmania              2.00188



In  simple  terms,  per  head  of  population,  Tasmania  receives 
slightly  more  than  twice  as  much  as  Victoria  receives. 
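
The effect of the multipliers on each State's share can be sketched as follows. The multipliers are those quoted above, with Victoria taken as 1. The population figures are the September 1978 values from table 1, used here only as a stand-in for the 31 December estimates the act specifies, so the resulting shares are illustrative rather than actual entitlements.

```python
# Multipliers from the act as quoted in the text; Victoria is the base State.
MULTIPLIERS = {
    "Victoria": 1.0,
    "New South Wales": 1.02740,
    "Queensland": 1.39085,
    "South Australia": 1.52676,
    "Western Australia": 1.66516,
    "Tasmania": 2.00188,
}

# Table 1 populations (thousands, September 1978); an assumption standing in
# for the 31 December estimates required by the act.
POPULATION_THOUSANDS = {
    "Victoria": 3825.8,
    "New South Wales": 5028.3,
    "Queensland": 2171.7,
    "South Australia": 1289.0,
    "Western Australia": 1227.3,
    "Tasmania": 414.4,
}


def state_shares(population, multipliers):
    """Each State's share of the pool: its adjusted population
    (population times multiplier) divided by the sum over all States."""
    adjusted = {s: population[s] * multipliers[s] for s in multipliers}
    total = sum(adjusted.values())
    return {s: value / total for s, value in adjusted.items()}


shares = state_shares(POPULATION_THOUSANDS, MULTIPLIERS)
# Share per thousand residents, for comparing States on a per-head basis.
per_head = {s: shares[s] / POPULATION_THOUSANDS[s] for s in shares}
```

The ratio `per_head["Tasmania"] / per_head["Victoria"]` equals the Tasmanian multiplier, which is the sense in which Tasmania receives slightly more than twice as much per head as Victoria.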

There  is  nothing  in  the  act  to  define  what  "population" 
means,  except  the  act  also  contains: 

"Section  10.  A  determination  made  by  the  Commissioner 
under  Section  6,  or  a  determination  made  by  the 
Australian  Statistician  under  Section  9,  shall,  for  the 
purposes  of  this  Act,  be  conclusively  presumed  to  be 
correct." 

The population "multipliers" in the legislation are a product
of  negotiation  between  the  Commonwealth  and  State  govern- 
ments and  are  subject  to  periodic  review.  Changes  in  relative 
allocations  can  be  effected  by  amending  legislation  in  the 
Commonwealth  Parliament.  The  link  between  population 
estimates  and  relative  allocations  is,  therefore,  by  no  means 
rigid. 

States  also  receive  money  to  be  reallocated  to  local 
government  authorities.  The  relevant  act,  "Local  Government 
(Personal  Income  Tax  Sharing)  Act  1976,"  specifies  the 
manner  in  which  the  money  is  to  be  allocated: 

A  State  shall 

(a)  allocate  not  less  than  30  per  centum  of  the  amount  to 
which  it  is  entitled  under  Section  5  in  respect  of  a 
year  amongst  local  governing  bodies  in  the  State  on  a 
population  basis,  that  is  to  say,  on  a  basis  that  takes 
into  account  the  respective  populations  of  those  local 
governing  bodies  and  may  take  into  account  the 
respective  sizes,  and  the  respective  population 
densities  of  the  areas  of  those  local  governing  bodies 
and  any  other  matters  agreed  upon  between  the  Prime 
Minister  and  the  Premier  of  the  State  as  being  relevant 
for  the  purposes  of  that  allocation;  and 

(b) allocate the remainder of the amount amongst local
government bodies in the State on a general equalization
basis, that is to say, on a basis that has the object
of ensuring, so far as is practicable, that each of those
local governing bodies is able to function, by reasonable
effort at a standard not appreciably below the
standards  of  the  other  local  governing  bodies  in  the 
State,  being  a  basis  that  takes  account  of  differences 
in  the  capacities  of  those  local  governing  bodies  to 
raise  revenue  and  differences  in  the  amounts  required 
to  be  expended  by  those  local  governing  bodies  in  the 
performance  of  their  functions. 

It  would  appear  that  the  legislation  allows  scope  for  an 
allocation  of  funds  among  local  governing  bodies  that  differs 
significantly  from  population  relativities. 
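
A minimal sketch of the two-part split shows how far the result can depart from population relativities. The act fixes only the 30 percent minimum for the population-basis portion; the strictly proportional population split, the equalization weights, and the body names below are all assumptions made for illustration.

```python
def allocate_to_local_bodies(entitlement, populations, equalisation_weights,
                             population_fraction=0.30):
    """Split a State's entitlement among local governing bodies.

    `population_fraction` of the entitlement (at least 0.30) is divided in
    proportion to population; the remainder is divided in proportion to
    hypothetical equalization weights reflecting revenue capacity and need.
    """
    if not 0.30 <= population_fraction <= 1.0:
        raise ValueError("population-basis portion must be between 30% and 100%")
    population_pool = entitlement * population_fraction
    equalisation_pool = entitlement - population_pool
    total_population = sum(populations.values())
    total_weight = sum(equalisation_weights.values())
    return {
        body: population_pool * populations[body] / total_population
        + equalisation_pool * equalisation_weights[body] / total_weight
        for body in populations
    }


# Hypothetical example: "Small" has 10 percent of the population but is
# judged to have three times the equalization need of "Big".
grants = allocate_to_local_bodies(
    1000.0,
    populations={"Big": 900, "Small": 100},
    equalisation_weights={"Big": 1, "Small": 3},
)
```

Under these assumed weights the smaller body receives more than the larger one, even though it holds a tenth of the population.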

ADJUSTMENT  FOR  UNDERENUMERATION 

Following  the  production  of  the  first  preliminary  results 
from  the  1976  census,  it  was  realized  from  comparison  with 
postcensal  (1971)  estimates  that  the  census  had  missed 
significant  numbers  of  the  population.  The  postenumeration 
survey  (PES)  conducted  after  the  census  confirmed  this. 
After  considerable  analysis  of  other  demographic  informa- 
tion, it  was  decided  that  a  "better"  population  estimate 
would  be  obtained  if  an  adjustment  was  made  to  the  census 
results  for  underenumeration  as  measured  in  the  PES.  The 
estimate  so  derived  was,  given  the  deficiencies  of  the  various 
collections,  sufficiently  close  to  the  population  derived  from 
demographic  analysis  as  to  make  the  PES  estimate 
acceptable. 

It  was  realized  that  it  would  be  unrealistic  to  use  a  small 
(two-thirds  of  one  percent  of  households)  survey  to  adjust  all 
census  results,  especially  those  for  small  areas. 

Therefore  it  was  decided  that: 

(a)  A  clearer  distinction  would  need  to  be  made  by  the 
ABS  between  "census  results"  and  "population 
estimates."  Population  estimates  are  based  on  census 
results  but  are  adjusted  for  underenumeration. 
Population  estimates  are  produced 

• Annually, showing the total population for each
LGA;

• Annually, showing the total population for each
State by age and sex; and

• Quarterly, showing the total population for each
State by sex.

As  well,  a  "civilian  population  15  years  and  over"  is 
produced  for  each  State  by  sex  and  age  on  a  monthly 
basis  for  use  in  estimation  in  the  monthly  labor  force 
survey.  These  are  projected  estimates  and  are  super- 
seded by  the  quarterly  estimates  when  actual  data  be- 
come available. 

(b)  Census  results  as  such  would  not  be  adjusted  for 
underenumeration. 

(c)  Only  basic  population  estimates  would  be  adjusted. 
Requests for underenumeration adjustment in additional
areas or for other characteristics would be met
by  giving  indications  of  underenumeration  but  not 
"officially"  providing  estimates.  That  is,  the  ABS 
would  be  prepared  to  give  qualitative  results  from  the 
postenumeration  survey  rather  than  quantitative. 

The  decision  to  produce  adjusted  population  estimates, 
while  having  significant  political  effects,  was  not  politically 
influenced— it  was  purely  an  ABS  decision  as  to  what  was 
technically  best.  The  general  acceptance  of  this  decision  is 
largely  the  result  of  the  range  of  adjustment  factors  being 
relatively  small  and  the  fact  that  most  allocations  are  based 
on  percentage  of  a  fixed  total  rather  than  on  a  per  capita 
fixed  amount. 
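
The distinction between a share of a fixed total and a fixed per capita amount can be illustrated with the State figures in table 3. The following is a minimal sketch, not part of the ABS procedure: the function name `share_and_level_changes` is hypothetical, and the only data used are the "as recorded" and "adjusted" columns of table 3.

```python
# State populations (thousands), June 1976, taken from table 3.
RECORDED = {"NSW": 4777.1, "VIC": 3647.0, "QLD": 2037.2, "SA": 1244.8,
            "WA": 1144.9, "TAS": 402.9, "NT": 97.1, "ACT": 197.6}
ADJUSTED = {"NSW": 4914.3, "VIC": 3746.0, "QLD": 2111.7, "SA": 1261.6,
            "WA": 1169.8, "TAS": 407.4, "NT": 101.4, "ACT": 203.3}


def shares(population):
    """Return each State's percentage share of the column total."""
    total = sum(population.values())
    return {state: 100.0 * value / total for state, value in population.items()}


def share_and_level_changes(recorded, adjusted):
    """For each State, return (change in share, percent change in level).

    The first element is what matters when a fixed total is divided by
    population share; the second is what matters when a fixed amount
    is paid per head.
    """
    recorded_shares = shares(recorded)
    adjusted_shares = shares(adjusted)
    changes = {}
    for state in recorded:
        share_change = adjusted_shares[state] - recorded_shares[state]
        level_change = 100.0 * (adjusted[state] / recorded[state] - 1.0)
        changes[state] = (share_change, level_change)
    return changes


if __name__ == "__main__":
    for state, (share, level) in share_and_level_changes(RECORDED, ADJUSTED).items():
        print(f"{state}: share {share:+.2f} points, level {level:+.2f} percent")
```

On these figures every State's share moves by less than 0.2 percentage points, while the population levels rise by between about 1 and 4.5 percent, which is consistent with the remark that share-based allocations are little affected.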

Our  population  estimates  are  a  potpourri  of  concepts,  and 
changes  over  recent  years  can  best  be  summarized  by  table  2. 

There  is  no  conceptual  justification  for  the  current  series 
nor  for  the  "prior  to  1976  census"  series.  They  are  neither 
"de  facto"  nor  "de  jure"  estimates  but  a  mixture  of  both. 
The  ABS  is  planning  to  move  to  a  fully  resident  based  series 
following  the  1981  census.  The  procedure  adopted  after  the 
1976  census  thus  produced  two  important  changes  to  the 
population  estimates: 

(a)  The  adjustment  for  underenumeration,  and 

(b)  The  inclusion  in  estimates  of  Australians  temporarily 
overseas  (and  the  corresponding  exclusion  of  visitors 
to  Australia).  This  is  largely  to  avoid  the  seasonal 
effect  of  tourism  on  population.  See  appendix  A  for 
table  showing  the  seasonal  pattern  and  the  net  effect 
of  short-term  movements. 

The  "error"  in  the  intercensal  adjustment  is,  as  best  we 
can  estimate,  relatively  small.  As  the  1971  census  was  subject 
to  a  different  level  of  underenumeration  (about  2  percent), 
we see from table 3 that the Australia total in column 1 is about
one-fourth of the way between columns 2 and 3. The same cannot
be  said  of  the  individual  States,  as  levels  of  under- 
enumeration may  have  varied  considerably  for  a  particular 
State  between  1971  and  1976.  As  well,  since  the  de  facto 
1971  census  figures  were  updated  only  for  permanent 
(internal)  moves,  any  variation  in  the  stock  of  visitors 
between  1971  and  1976  in  a  particular  State  would  lead  to 
apparent  "errors"  in  any  comparison. 
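
The "one-fourth" figure can be checked directly from the Australia totals in table 3. This is only an arithmetic sketch; the helper name `position_between` is hypothetical.

```python
# Australia totals (thousands), June 1976, from table 3.
ESTIMATE_1971_BASE = 13642.8   # column 1: estimate carried forward from 1971
CENSUS_RECORDED = 13548.5      # column 2: 1976 census as recorded
CENSUS_ADJUSTED = 13915.5      # column 3: 1976 census adjusted


def position_between(value, low, high):
    """Return where `value` lies between `low` (0.0) and `high` (1.0)."""
    return (value - low) / (high - low)


fraction = position_between(ESTIMATE_1971_BASE, CENSUS_RECORDED, CENSUS_ADJUSTED)
print(f"{fraction:.3f}")  # about 0.257, i.e. roughly one-fourth
```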

Table 2. Intercensal Estimates of Population

                                   Base                          Intercensal adjustment
Series                   Internal¹             External²        Internal¹      External²

Prior to 1976 census     De facto, as          De facto         De jure        De facto
                         enumerated

1976 to 1981             De facto,             De facto         De jure        De jure
                         adjusted for
                         underenumeration

Planned for post-        De jure,              De jure          De jure        De jure
1981 census              adjusted for
                         underenumeration

¹"Internal" refers to the population within Australia.

²"External" refers to population movements between Australia and the rest of the world.
"External de facto" therefore means that Australians temporarily outside Australia would not be
included in the Australian population, and overseas visitors to Australia would be included.
"External de jure" means that such temporary movements would be excluded from calculations.


Table 3. Population by State, June 1976

(Thousands)

           (1) Estimated       (2) 1976       (3) 1976 census
           population,         census as      adjusted for          As percentage of column totals
State      1971 census base    recorded       underenumeration        (1)        (2)        (3)

Total          13,642.8        13,548.5          13,915.5           100.00     100.00     100.00

NSW             4,829.4         4,777.1           4,914.3            35.40      35.26      35.32
VIC             3,696.0         3,647.0           3,746.0            27.09      26.92      26.92
QLD             2,014.9         2,037.2           2,111.7            14.77      15.04      15.18
SA              1,244.7         1,244.8           1,261.6             9.12       9.19       9.07
WA              1,144.3         1,144.9           1,169.8             8.39       8.45       8.41
TAS               410.2           402.9             407.4             3.01       2.97       2.93
NT                 98.5            97.1             101.4             0.72       0.72       0.73
ACT               204.6           197.6             203.3             1.50       1.46       1.46


HOW THE ACTUAL ADJUSTMENT WAS DONE

Once the decision to adjust was made, the procedures
adopted were relatively straightforward. I shall discuss the
production of State estimates first, then the local government
authority estimates.

State Estimates

The underenumeration rates for each State (as measured in
the coverage survey of the PES; see appendix B for a brief
description of the survey) were used directly to produce the
estimates of total underenumeration for each State. Sex and
age estimates for the State were then produced by demographic
analysis; the PES rates were not used for other
than total population.

For  a  historical  series,  the  picture  is  very  complex 
because  of  the  effect  of  other  conceptual  changes  (e.g., 
inclusion  of  aboriginals,  exclusion  of  net  short  term  visitors). 
However,  the  procedure  was  as  follows. 

•  Assume  1961  "as  recorded"  and  1976  "as  adjusted" 
were  "correct." 

•  Assume  1971-1976  births,  deaths,  and  migration  were 
correct.  This  led  to  an  implicit  underenumeration  in 
1971  of  about  2  percent.  The  1971  PES,  which  was 
known  to  have  a  number  of  serious  deficiencies,  had 
measured 1.35 percent. The 2 percent was, however,
allocated according to the State underenumeration
rates as measured in 1971, multiplied by 2/1.35,
approximately.

•  The  "error"  between  1971  and  1961  was  then  spread 
over  the  10  years. 
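
The scaling step in the second bullet can be sketched as follows. The national figures (1.35 percent measured, about 2 percent implied) come from the text; the State rates in the usage example are invented purely for illustration, and the function name is hypothetical.

```python
MEASURED_NATIONAL_1971 = 1.35  # percent, as measured by the 1971 PES
IMPLIED_NATIONAL_1971 = 2.0    # percent, implied by working back from 1976


def scale_state_rates(state_rates, measured=MEASURED_NATIONAL_1971,
                      implied=IMPLIED_NATIONAL_1971):
    """Inflate measured State rates so they are consistent with the implied
    national rate, i.e. multiply each rate by implied / measured."""
    factor = implied / measured
    return {state: rate * factor for state, rate in state_rates.items()}


if __name__ == "__main__":
    # Hypothetical measured rates (percent); not actual 1971 PES results.
    example_rates = {"State A": 1.0, "State B": 1.35, "State C": 2.0}
    for state, rate in scale_state_rates(example_rates).items():
        print(f"{state}: {rate:.2f} percent")
```

The factor 2/1.35 is about 1.48, so a State measured at the national average of 1.35 percent is raised to 2 percent and the relative differences between States are preserved.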

As  said  above,  the  procedure  was  complex,  and  for  almost  all 
external  purposes  a  break  in  the  series  is  shown  before  June 
1971,  with  a  footnote: 

"The  estimates  for  1971  for  each  State  and  Territory  are 
made  from  the  1971  census  results,  with  augmented 
adjustments  for  underenumeration  to  make  the  total 
balance  with  the  estimates  for  Australia  made  retro- 
spectively from  1976"  (from  1979  Australian  Year  Book, 
p.  80). 
LGA Estimates

Corresponding  LGA  adjustments  for  underenumeration  were 
made to bring them into line with State estimates. These
adjustments are approximate, but I believe making them was better than
doing nothing. As has been seen from the earlier discussion,
population  estimates  are  not  a  critical  component  of  electoral 
distribution  or  fund  allocation  for  LGA's. 

From  the  PES,  estimates  of  underenumeration  rates  for 
the  individual  strata  of  the  PES  survey  (there  are  approxi- 
mately 500  strata  in  the  survey)  were  obtained.  Within  each 
State,  these  strata  were  grouped  into  relatively  homogeneous 
groups  such  that 

1.  The  standard  error  of  the  percentage  underenu- 
meration for  the  total  of  each  group  of  strata  should 
not  exceed  25  percent;  and 

2.  The  lowest  underenumeration  rate  must  be  no  more 
than  30  percent  less  than  the  highest  under- 
enumeration rate  within  a  group. 
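
The two grouping criteria can be expressed as a simple check. This is a sketch under stated assumptions rather than the ABS procedure: criterion 1 is read here as a limit on the relative standard error of the group's rate, which the caller supplies, and the function name and arguments are hypothetical.

```python
def group_is_acceptable(stratum_rates, group_relative_se_pct,
                        max_relative_se_pct=25.0, max_spread=0.30):
    """Check a candidate group of strata against the two criteria.

    stratum_rates          -- underenumeration rates (percent) of the strata
    group_relative_se_pct  -- relative standard error (percent) of the
                              group's overall rate (assumed interpretation)
    """
    lowest, highest = min(stratum_rates), max(stratum_rates)
    # Criterion 1: standard error for the group total within the limit.
    se_ok = group_relative_se_pct <= max_relative_se_pct
    # Criterion 2: lowest rate no more than 30 percent below the highest.
    spread_ok = lowest >= (1.0 - max_spread) * highest
    return se_ok and spread_ok


if __name__ == "__main__":
    print(group_is_acceptable([2.0, 2.5], group_relative_se_pct=20.0))  # True
    print(group_is_acceptable([1.0, 2.5], group_relative_se_pct=20.0))  # False
```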

The  underenumeration  rate  for  the  grouped  strata  was  then 
allocated  to  all  LGA's  (or  parts)  within  the  grouped  strata. 
As  a  general  rule,  the  grouped  strata  were  along  the  lines 

•  inner  city  (high  underenumeration) 

•  inner  suburbs  (moderate  underenumeration) 

•  "suburbia"  (low  underenumeration) 

•  nonmetropolitan  urban  (low  underenumeration) 

•  nonmetropolitan  rural     (moderate  underenumeration) 

The  historical  series  of  LGA  estimates  were  adjusted  back  to 
1971  in  virtually  the  same  way  as  the  State  and  have  been 
updated  annually  since  1976.  No  cross-classifications  are 
adjusted  at  the  LGA  level. 

EFFECT  ON  OTHER  USERS 

One  question  that  has  been  raised  in  the  documentation 
from  the  U.S.  Census  Bureau  is  "Has  the  credibility  of  the 
Bureau  (and  the  census)  suffered?"  The  answer  is  a  cautious 
"no."  I  am  cautious  for  two  reasons: 

1.  The  ABS  has  not  run  a  census  since  making  the  change, 
and  I  would  expect  that  any  criticism  from  the  general 
public  would  not  occur  until  census  time. 

2.  The  ABS  was  severely  criticised  after  the  estimates 
were  released,  but  this  was  not  for  what  we  had  done, 
but  rather  for  the  fact  that  we  did  not  explain  to  the 
users  what  the  various  estimates  meant;  we  confused 
the  users  with  the  proliferation  of  estimates.  Within  a 
few  months,  users  had  to  contend  with: 


September 1976:      Post-1971 censal estimates for June 1976

Sept.-Dec. 1976:     Preliminary 1976 census counts for LGA's
                     for June 1976

January 1977:        "Final" State totals from preliminary
                     census processing for June 1976

February 1977:       State estimates adjusted for
                     underenumeration for June 1976

March 1977:          Release of LGA unadjusted estimates
                     from preliminary processing

June 1977:           Release of LGA adjusted estimates


I  have  no  doubts  that  this  confusion  will  be  avoided 
following  the  1981  census.  I  repeat  that  the  adjustment  itself 
has  not  come  under  criticism,  and  there  has  been  no  backlash 
on  the  ABS  for  making  the  adjustment. 

It  would  be  foolish  to  say  that  the  adjustment  decision 
had  not  produced  some  problems  for  users,  especially  where 
they  are  using  population  estimates  data  and  then  need  to 
disaggregate  the  analysis.  In  such  situations,  they  are  forced 
to  use  unadjusted  census  data  and  are  no  worse  off  than  if 
the  ABS  had  not  adjusted. 


CONCLUSIONS 

It  is  a  fact  of  census  taking  that  a  census  will  always  have 
errors,  both  in  coverage  and  content.  How  important  these 
errors  are  depends  on  their  size  and  the  uses  to  which  the 
data  are  put. 

In  Australia  in  1976,  the  size  of  the  undercoverage  was 
such  as  to  discredit  the  census  by  itself  for  one  of  its  major 
legislative  uses— the  determination  of  electoral  represen- 
tation. Adjustment  was  a  necessary  step,  but  it  was  an 
after-the-event  decision.  We  had  not  anticipated  the  high 
undercoverage. 

It  is  important  to  stress  that  the  detailed  census  results 
were  not  adjusted  for  underenumeration.  Only  the  inter- 
censal  population  estimates  series  were  adjusted  to  include 
the  undercount  estimate.  The  population  estimates  are  very 
restricted  in  the  amount  of  detail  shown,  but  they  are 
used  in  determining  electoral  representation  and  the  alloca- 
tion of  revenue  to  States. 

I  see  adjustment  as  a  method  of  patching  up  holes.  It  is 
better,  however,  to  prevent,  if  possible,  the  emergence  of  the 
holes  in  the  first  place.  For  1981,  we  will  be  doing 
considerably  more  to  improve  coverage.  However,  under- 
enumeration will  occur,  and  I  feel  that  the  ABS  will  continue 
to  adjust  population  estimates  for  underenumeration. 
APPENDIX A

NET EFFECT OF SHORT-TERM MOVEMENTS BETWEEN AUSTRALIA AND OVERSEAS

                   Australian
Period             residents       Visitors        Total

1971  March          23,043        -18,889         4,154
      June          -33,130        -10,939       -44,069
      Sept.          20,213         -4,303        15,910
      Dec.          -11,445         16,502         5,057

1972  March          26,622        -19,980         6,642
      June          -45,739        -15,865       -61,604
      Sept.          19,818         -2,875        16,943
      Dec.          -14,258         23,803         9,545

1973  March          39,334        -25,929        13,405
      June          -54,735        -13,717       -68,452
      Sept.          16,790         -1,103        15,687
      Dec.          -18,688         20,972        12,284

1974  March          48,859        -13,237        35,622
      June          -65,716         -7,566       -73,282
      Sept.          24,379          3,740        28,119
      Dec.          -24,954         34,368         9,414

1975  March          43,652        -21,200        22,452
      June          -52,036        -13,916       -65,952
      Sept.          16,190          5,664        21,854
      Dec.          -39,012         39,021             9

1976  March          66,186        -18,730        47,456
      June          -62,929         -7,360       -70,289
      Sept.          28,897          5,272        34,169
      Dec.          -37,688         40,218         2,530

1977  March          78,268        -19,444        58,824
      June          -71,406        -10,907       -82,313
      Sept.          29,943          4,544        34,487
      Dec.          -34,381         48,145        13,764

1978  March          66,574        -28,198        38,376
      June          -78,044         -7,860       -85,904

APPENDIX  B 

PERSON  COVERAGE  CHECK  OF  THE 
POSTENUMERATION  SURVEY  (1976  CENSUS) 

The  person  coverage  check  (PCC)  was  conducted  as  part 
of  the  postenumeration  survey.  This  survey  was  run  2  weeks 
after  census  night,  and  it  was  designed  to  produce  estimates 
of  net  underenumeration  of  persons.  The  PCC  was  a  0.67- 
percent  sample  of  private  dwellings  across  Australia. 

Persons  living  in  nonprivate  dwellings  (e.g.,  hotels,  motels, 
hospitals)  and  sparsely  settled  areas  were  excluded  from  the 
postenumeration  survey  because  of  operational  difficulties  in 
conducting  follow-up  interviews.  However,  these  amount 
only  to  about  5  percent  of  the  population,  and  hence  any 


underenumeration  of  them  is  unlikely  to  have  a  significant 
effect  on  the  overall  level  of  underenumeration. 

The  postenumeration  survey  sought  only  a  limited 
amount  of  information  from  sample  households,  i.e.,  sex, 
age,  marital  status,  country  of  birth,  and  employment  status. 
The  results  relating  to  total  number  of  persons  are  subject  to 
a  standard  error  of  approximately  0.04  percent  at  the 
Australian  level,  less  than  0.1  percent  at  the  State  level,  and 
less than 0.5 percent in the Australian Capital Territory and
the Northern Territory.

The  randomly  selected  houses  were  approached  by  450 
trained  interviewers  to  determine  the  number  of  persons 
staying  at  each  household  on  census  night.  The  estimated  net 
underenumeration  was  derived  by  comparing  the  people 
enumerated  on  census  night  as  stated  on  the  census  schedule 
with  the  people  living  at  that  household  on  census  night  as 
enumerated  by  the  interviewer. 
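
The comparison described above can be sketched as a simple rate calculation. The appendix does not spell out the formula, so the choice of the interviewer's count as the denominator is an assumption, and the function name and the counts in the usage example are hypothetical.

```python
def net_underenumeration_pct(census_count, interviewer_count):
    """Net underenumeration as a percentage of the persons found by the
    interviewer (assumed denominator).

    census_count      -- persons recorded on the census schedules
    interviewer_count -- persons found at the same dwellings by the
                         postenumeration interviewers
    """
    if interviewer_count <= 0:
        raise ValueError("interviewer_count must be positive")
    return 100.0 * (interviewer_count - census_count) / interviewer_count


if __name__ == "__main__":
    # Invented counts: 1,000 persons found on interview, 973 on the schedules.
    print(f"{net_underenumeration_pct(973, 1000):.1f} percent")
```

Because the measure is net, persons counted twice in the census offset persons missed, so a negative result would indicate a net overcount.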

While every effort is made to minimize underenumeration
in the census, some inevitably remains, for various reasons:

1.  Inadvertent omission of very young children,

2.  Persons  missed  because  the  dwelling  was  missed  by  the 
collector  (out-of-date  maps,  ill-defined  LGA 
boundaries,  etc.), 

3.  Treatment  by  the  collector  of  an  occupied  dwelling  as 
unoccupied,  and 

4.  Persons  in  occupied  dwellings  not  wishing  to  be 
included  on  the  census  household  schedule. 

Table B-1. Census Underenumeration as Shown by the
Postenumeration Survey, by Age Group and Sex: 1976

(Percent)

Age               Males     Females     All persons

All ages           3.03       2.40          2.71

0-4                2.91       3.08          2.99
5-9                1.97       1.73          1.85
10-14              1.82       1.54          1.68
15-19              3.64       3.21          3.43
20-24              5.55       3.90          4.73

25-29              3.88       2.55          3.23
30-34              3.63       1.43          2.56
35-39              2.90       1.70          2.31
40-44              2.51       1.82          2.18
45-49              2.57       2.03          2.30

50-54              3.22       2.67          2.94
55-59              2.87       2.43          2.65
60-64              1.61       2.19          1.91
65-69              1.99       2.47          2.25
70 and over        2.82       2.73          2.76


Results  from  the  Person  Coverage  Check.  Overall,  the 
PCC  revealed  that  2.7  percent  of  the  people  were  missed  in 
the  1976  census.  Some  general  conclusions  can  be  drawn 
from  the  analysis  of  the  PCC  file: 
1.  The underenumeration rate for males was higher than
for females (3.0 versus 2.4 percent).

2.  The underenumeration rate varies between different
age groups and ranges from 1.7 percent for the 10- to
14-year age group to 4.7 percent for the 20- to 24-year
age group. In fact, the rate peaked for the
20-year-old age group at 6.2 percent.

3.  Divorced people (5.3 percent), followed by people who
are permanently separated, had a higher underenumeration
rate than married people (2.1 percent).

4.  Persons  born  outside  Australia  except  the  United 
Kingdom  had  a  higher  underenumeration  rate  (3.2 
percent)  than  persons  born  in  Australia. 

5.  Unemployed  persons  had  an  underenumeration  rate  of 
6.4  percent  compared  with  3.0  percent  for  employed 
persons. 


Census Undercount: The International Experience


Meyer  Zitter  and  Edith  K.  McArthur 

Bureau  of  the  Census 


A  review  of  the  practices  of  most  of  the  developed  coun- 
tries of  the  world  relative  to  census  undercount  suggests  that 
the  United  States  does  not  have  a  monopoly  in  its  concern 
for  complete  coverage  of  the  population  in  conducting  its 
censuses  and  in  measuring  and  evaluating  their  accuracy. 
However,  the  similarity  stops  there.  The  broad-based  issues 
being  addressed  at  this  conference,  the  impact  on  public 
policy,  the  adequacy  of  existing  methods  in  measuring  com- 
pleteness of  coverage,  the  legality  and  equity,  and  the  con- 
cerns of  a  wide  variety  of  policymakers  and  data  users  as 
expressed  in  the  materials  being  presented  here,  do  not 
emerge  as  being  very  important  for  most  of  the  countries. 

The minor role played by census undercount as an issue of
public policy is also confirmed by the paucity of international
discussion and debate. As far as the writers know,
census undercount and its implications for public policy have
not been a full-scale agenda item of the regular meetings of
the  Conference  of  European  Statisticians,  of  the  Economic 
Commission  for  Europe  (ECE)  countries,  or  of  the  expert 
group  meetings  convened  periodically  to  address  specific 
topics  of  concern,  although  there  is  evidence  of  substantial 
interest  in  measuring  and  evaluating  completeness  of  census 
coverage.  For  example,  a  review  of  agenda  topics  of  the 
regular  annual  meetings  of  the  Conference  of  European 
Statisticians  shows  great  concern  for  the  quality  and  precision 
of  census  results.  Indeed,  there  are  recommendations  that 
the  precision  of  census  results  be  included  as  a  topic  for 
study  and  discussion,  that  quantitative  evaluations  and  mar- 
gins of  error  be  published,  and  that  information  be  compiled 
concerning  the  methodology  used  by  various  countries  in 
the  conference  that  carried  out  coverage  checks  by  com- 
paring census  results  with  population  registers.  But  there  has 
been  no  meeting  dedicated  to  a  discussion  of  the  broad 
concerns  being  addressed  here. 

This  conference,  thus,  is  the  first  of  its  kind  and  may 
establish  the  model  for  other  countries  sure  to  follow  in  the 
years  to  come. 

BACKGROUND 

This  paper  reviews  the  experience  of  other  countries  on 
the  general  issue  of  census  undercount.  The  report  has  two 
separate  components:  First,  a  review  is  made  of  how  selected 
countries,  i.e.,  those  participating  in  the  Conference  of 
European  Statisticians,  handled  the  general  problem  of 
census  coverage  and  the  undercount  issue  in  connection 
with  the  last  census.  Second,  a  review  is  made  of  data  for 
developing  countries  in  the  data  files  of  the  International 
Demographic  Data  Center  (IDDC)  at  the  Census  Bureau  to
provide  additional  perspective  on  methods  of  measurement 
of  coverage  and  on  the  level  of  census  completeness  in  de- 
veloping countries. 

This  paper  is  not  intended  as  an  in-depth  study  of  inter- 
national practices.  Rather,  it  is  designed  to  provide  tone  and 
flavor  as  to  the  general  level  of  concern  of  other  developed 
and  developing  countries  on  the  undercount  issue.  It  is 
narrow  and  limited  in  scope  for  a  number  of  reasons.  In  the 
first  part  of  the  study,  inquiries  were  made  only  of  countries 
participating  in  the  Conference  of  European  Statisticians 
operating  under  the  auspices  of  the  ECE.  The  study  was 
limited  to  these  countries  because  it  was  believed  that  the 
state  of  development  of  the  statistical  systems  and  the  role 
statistics  play  in  shaping  public  policy  more  closely  parallel 
the  state  of  the  art  and  practices  in  the  United  States.  All 
these  countries  were  "developed"  countries  as  classified  by 
the  United  Nations.  Furthermore,  the  review  relative  to  the 
practices  of  the  countries  in  the  ECE  relies  primarily  on 
information  received  in  response  to  our  inquiry  raising  the 
several  questions  detailed  below.  There  were  no  attempts  or 
opportunities  for  additional  detailed  correspondence  or 
discussion,  or  for  elaboration  and  clarification  of  a  number 
of  points  relating  to  census  coverage.  There  is  also  the  ques- 
tion as  to  whether  "all  the  right  questions  were  asked"  if  one 
were  trying  to  do  a  thorough  analysis  of  the  practices  and 
concerns  of  countries  relevant  to  census  undercount.  For 
example,  we  did  not  inquire  as  to  the  uses  of  census  data  and 
postcensal  estimates  which  are  important  variables  relative  to 
the  importance  of  the  issue.  (In  the  United  States,  for  ex- 
ample, the  major  concern  with  census  undercount  did  not 
emerge  until  the  1970's,  as  more  and  more  money,  Federal 
legislation,  and  political  representation  were  being  impacted 
by  census  results.)  Nor  could  we  benefit  from  personal 
discussion  and  input  as  in  the  case  of  Australia.  However, 
with  these  limitations,  the  review  is  designed  to  tell  us  much 
about  where  many  of  the  countries  stand  relative  to  the 
United  States  in  the  role  that  coverage  issues  play  in  census 
activities. 

Letters  were  sent  to  25  countries  (see  appendix  A)  of  the 
ECE  region  raising  the  following  questions: 

1.  Do you routinely measure the completeness of
coverage?

2.  What kind of methodology is used?

3.  How extensive is the geographic detail for which
coverage estimates are prepared?

4.  Are the census data adjusted and if so, what specific
uses are made of the adjusted series?

5.  Are census data (unadjusted) used for some purposes,
whereas adjusted figures are used for others?

6.  Is the question of census undercount an important
issue in your country?

Replies  were  received  from  24  countries  and  are  sum- 
marized below  in  table  1.  Narrative  summaries  for  each 
country  are  given  in  appendix  B. 

In  addition  to  this  review  of  the  practices  of  the  ECE 
countries,  a  separate  review  was  made  of  the  data  available  for 
developing  countries  in  the  demographic  data  files  of  the  IDDC
of  the  Census  Bureau.  The  Center,  as  part  of  its  responsibilities 
in  compiling  demographic  data,  routinely  puts  together  what 
is  known  about  the  accuracy  and  reliability  of  census  and 
survey  data  for  the  developing  countries  around  the  world. 
Material for some 25 countries (of 1 million or more population)
in which postenumeration surveys or other evaluations
of census coverage were undertaken is also summarized
here  to  provide  the  conference  with  some  general  perspective 
on  the  levels  of  completeness  of  census  coverage  around  the 
world.  There  are  two  kinds  of  figures  shown— the  "official" 
estimates  of  undercount,  usually  based  on  postenumeration 
surveys  (PES)  or  other  types  of  studies,  and  the  "adjusted" 
levels  of  undercount,  based  on  the  IDDC  review  and  evalua- 
tion. These  data  are  presented  in  table  2.  Source  notes  are 
given  in  appendix  C. 

SUMMARY  OF  RESPONSES  OF  DEVELOPED 

COUNTRIES  TO  CENSUS  BUREAU 

COVERAGE  INQUIRY 

In response to the first item, "do they measure coverage
completeness," 18 of the 24 countries that replied said "yes,
some comparison study was done"; 4 replied that they did not
measure coverage completeness. One country (France) said
"yes,"  but  referred  mainly  to  a  study  conducted  in  1962, 
and  one  country  (Sweden)  said  "no  comprehensive  study," 
but  coverage  of  housing  units  and  economically  active 
persons  is  measured.  Furthermore,  two  of  the  countries  that 
said  "no"  (Denmark  and  Norway)  were  depending  primarily 
on  population  registers  rather  than  a  conventional  census  for 
census-type  data.  Thus,  the  majority  of  these  countries  do 
conduct  some  study  of  completeness  of  census  coverage. 

As  to  "method  of  measurement"  of  census  coverage,  there 
are  two  main  measures  of  coverage  accuracy  that  emerge 
from  the  responses:  A  PES  type  of  survey  and  comparisons 
with  existing  population  registers.  Of  those  countries  that 
conducted  coverage  studies,  over  half  have  population 
registers,  some  of  which  are  very  extensive  and  compre- 
hensive. In  Norway,  the  Netherlands,  and  Finland,  the  census 
is drawn from the register, and Denmark in 1980 will move
to a "census" completed by compiling data using identification
numbers from the several population registers rather
than the traditional type census. Incidentally, in reviewing
the materials on methods of measuring completeness of
coverage, it appears that in some countries, measures of
coverage are derived as byproducts of procedural and quality
control operations carried out as part of the basic census
instead of through a separate study.

On the issue of "completeness of coverage," the material
suggests a very low rate of undercoverage for the developed
countries. The amount of indicated undercount runs from
"negligible," indicated by Czechoslovakia, to a comment of
"they have complete enumeration" for the U.S.S.R. However,
this  means  in  the  U.S.S.R.  that  all  errors  discovered  in  the 
several  postcensal  operations,  including  their  postenumera- 
tion survey,  are  corrected  on  the  census  records.  In  countries 
making  comparisons  with  registers,  it  is  not  always  clear 
whether  the  undercount  measure  is  a  net  rate.  Several  of 
the  countries  appeared  to  consider  that  the  census  was  more 
accurate  than  their  register  system.  In  the  case  of  Italy,  the 
register  was  1.7  percent  higher  than  the  census  count,  but 
the  inference  is  that  it  was  less  reliable  than  the  census;  the 
register  failed  to  keep  track  of  a  large  emigration  rate, 
especially  from  southern  Italy.  Finland's  reference  to  a  2.7 
percent  undercount  of  population  meant  that  2.7  percent 
of  persons  on  the  register  did  not  respond  to  the  census. 
For  these  persons,  vital  information  was  tabulated  from  the 
register;  only  labor  force  characteristics  were  not  available 
for  omitted  persons.  In  Canada,  the  coverage  methodology 
used  was  a  reverse  record  check  accomplished  by  tracing 
a  sample  of  persons  drawn  from  various  records  (between 
1970  and  1975)  to  ascertain  whether  they  were  recorded  in 
the  1976  census.  Only  in  Bulgaria  and  in  Spain  were  there 
indications  of  an  overcount— 0.06  percent  and  2.3  percent, 
respectively.  In  Bulgaria  this  was  due  to  an  excess  of  double- 
counted  persons  over  omitted  persons.  In  Spain,  the  over- 
count was  of  the  de  jure  population.  The  figures  are  cited  to 
provide  some  idea  as  to  coverage  levels  as  measured  and 
reported  by  these  countries.  No  attempt  was  made  to 
evaluate  the  reliability  of  the  methods  described  or  of  the 
results  obtained. 

With  regard  to  the  undercount  differential,  most  of  the 
countries  reporting  an  undercount  also  indicate  some  dif- 
ferentials in  coverage  for  different  groups  of  the  population 
or  for  different  geographic  regions.  Austria,  for  example, 
with  an  overall  reported  undercount  of  0.4  percent,  indicated 
that  foreign  workers  were  undercounted  by  as  much  as  20 
percent  based  on  comparison  with  work  permits  that  had 
been  issued.  Also,  certain  types  of  persons,  for  example, 
young  persons,  single  persons,  and  males,  were  more  likely  to 
be  counted  wrong  in  Finland,  Bulgaria,  and  Canada.  In 
Canada,  also,  the  undercoverage  of  persons  with  a  non- 
English    mother    tongue   was   higher   than   for  the   general 

population. 

Most  of  the  countries  provided  measures  of  undercoverage 

at  the  national  level  only.  Finland  indicated  figures  available 

to  the  communal  level,  some  475  areas,  and  Sweden  provided 

measures  for  regional/metropolitan  areas.  Thus,  in  general, 
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Table  2.  Census  Coverage  Measurements  in  Selected  Developing  Countries  with  Populations  of  1  Million  or  More 


Country 


Date 

of  latest 

census 


Enumerated 
population 
(thousands) 


Coverage 

measure 

used 


Estimated 

percent 

undercoverage 


IDDC 

estimate  of 

undercoverage1 


Notes— use  of 

coverage 

estimates2 


Algeria 

Bangladesh 

Bolivia 

Cameroon 

Chile 

Colombia 
Ecuador 
El  Salvador 

Hong  Kong 

India 

Indonesia 

Iran 

Israel 

Jordan 

Korea 
Liberia 


Malaysia 
(peninsular) 

Pakistan 


1977 

1974 
1976 
1976 

1970 

1973 
1974 
1971 

1976 

1971 
1971 
1976 

1972 

1961 

1970 
1974 

1970 
1972 


16,260 

71,479 
4,613 
7,132 

8,885 

21,056-21,238 
6,522 
3,555 

4,420 

547,950 

119,232 

33,662 

3,148 


"Implied" 
1,638 


31,466 

"Implied" 
1,338 


8,810 


65,309 


PES 

PES 
PES 
PES 

Demographic 
analysis 

PES 

PES 

Demographic 
analysis 

Time  series 
and  PES 

PES 

PES 

PES 

Demographic 
analysis 

PES 

PES 
PES 


PES 


PES 


Paraguay 

1972 

2,358 

PES 

Peru 

1972 

13,538 

PES 

Sierra  Leone 

1974 

2,735 

other 

3.4 

6.4 
7.0 
6.9 

4.8 

6.6-9.0 
2.4 
3.6 


.66  (time  series) 
.42  (PES) 


4.1 


6.3 


Evaluation 
incomplete 

No  IDDC  evaluation 

No  IDDC  evaluation 

No  IDDC  evaluation 


Estimate  accepted 
after  evaluation 

9.4 

2.6 

Estimate  accepted 
after  evaluation 

IDDC  accepts  estimate 
from  time  series 


Official  adjustment 
of  census  count 

(NA) 

(NA) 

Official  adjustment 
of  census  count 

Official  adjusted 
population 

(NA) 

(NA) 

Official  adjusted 
population 

(NA) 


1.6 

2.7 

(NA) 

3.5 

4.9 

(NA) 

3.0 

Evaluation 
incomplete 

(NA) 

0.3 

Estimate  accepted 

Official  adjusted 

after  evaluation 

population 

4.0 

4.3 

Adjusted  figures 
are  used  as  official 
counts 

5.0 

5.2 

(NA) 

11.0 

No  IDDC  evaluation 

Adjusted  figures 
are  used  as  official 
counts 

4.7 


Estimate  accepted 
after  evaluation 


(NA) 


(NA) 


8.9 

9.9 

(NA) 

4.7 

4.9 

(NA) 

8.9 

No  IDDC  evaulation 

Official  adjusted 
population 

(See footnotes at end of table.)

Table 2. Census Coverage Measurements in Selected Developing Countries with Populations of 1 Million or More (Continued)

                        Date of  Enumerated       Coverage                                   Estimated       IDDC                 Notes: use of
                        latest   population       measure                                    percent         estimate of          coverage
Country                 census   (thousands)      used                                       undercoverage   undercoverage (1)    estimates (2)

South Africa            1970     21,794           Method of deriving time series is unknown  2.6             No IDDC evaluation   Official adjusted population
Sri Lanka               1971     12,690           PES                                        0.3             1.0                  (NA)
Sudan                   1973     14,114           PES                                        4.8             No IDDC evaluation   Official adjusted population
Thailand                1970     34,397           PES                                        1.7             6.6                  (NA)
Yemen (Sana'a)          1975     4,520            PES                                        2.9             No IDDC evaluation   (NA)

1 The International Demographic Data Center analyzed census and coverage methodologies to determine their accuracy. In those countries not
having an "IDDC" estimate, only the country's own estimate of undercoverage is shown.

2 For further information on sources of data from individual countries, see appendix C on sources and comments on data from selected
developing countries.


only  national  data  are  available  with  very  little  geographic 
detail  emerging. 

The relative importance of the issue also varied greatly
between responding countries. Census undercount was not
an important issue in most of the countries. However, according
to some of the responses, it had not been important
because, until the recent census, undercount was believed to
be minimal. In Finland, the lack of labor force characteristics
for the 2.7 percent of the population missed was felt to be
important, but they did not indicate the impact of the
missing data on statistics or policy. West Germany offered
the comment that the problem of an undercount is becoming
increasingly important. In Canada, although no adjustments
to census data were made in 1976, the possibility of adjusting
results for undercoverage in 1981 is being seriously considered.
We have already heard about the case in Australia
where the issue was important enough to warrant the attention
of the legislature. In the Netherlands, the issue of
coverage, in terms of whom to enumerate, is currently
receiving attention from the legislature. Through the last
census in 1971, the census was used to check the accuracy
of the population registers; however, in 1981, the census
may be limited to persons already listed in the population
registers. This restriction is being considered in order to avoid
the appearance of tracking down illegal aliens in the Netherlands.
For the remaining countries, the issue was either not
serious or nonexistent; thus, a conference on the undercount
would not have attracted much attention in these countries.

The final question was "Are census data adjusted?" Only
Finland and Sweden (and Australia, of course) indicated
"yes." All other countries indicated "no." In the case of
Finland, they leave it up to the user to decide whether to
use adjusted or unadjusted series. In the case of Sweden,
adjusted figures are for regional planning purposes only;
they are not used for "official purposes."

Thus, the picture that emerges based on the review of the
countries here is that the issues being addressed by this conference
are not of general concern. Of course, the type of
society, census conditions, the existence of population registers,
and the uses of census data provide an entirely different
environment for census taking and the purposes of census
data. So, perhaps it is not unexpected that the experience as
indicated is so unlike that of the United States. On the other
hand, it is the writers' personal observation that some
increase in interest and concern on this issue is occurring.
Whereas the United States routinely measures the undercount
as part of the census operation, in the past, the very
concept of a census undercount in these countries would
have appeared to be alien to their thinking; it was not
considered by the professional statisticians apart from
concerns of quality of census content. The view seems to
be changing and more of the countries of western and
northern Europe will see this emerging as an important issue
in later censuses. In fact, one indication of increasing interest
is that the general question of whether census data are
adjusted for use in postcensal estimates may be included
as an agenda item of the Conference of European Statisticians.
Perhaps the next conference on the undercount,
modeled along the present one, could be convened sometime
before the 1990 census and be truly international in scope
and take place under the auspices of the United Nations.
With the 1980 census almost upon us, it is not too early for
some appropriate international party, such as the United
Nations or one of its regional bodies, to serve as the catalyst
for encouraging international discussion of the many issues
considered at this conference relative to census undercount.

THE EXPERIENCE OF SELECTED DEVELOPING
COUNTRIES IN CENSUS COVERAGE
MEASUREMENT

The second part of the study of the international experience
with census coverage measurement involved a review of
data on developing countries collected by the International
Demographic Data Center at the Census Bureau. All of the
26 countries with populations of 1 million or more included
in this review conducted some sort of coverage evaluation
after their censuses were completed. Twenty of these developing
countries derived their estimates through a postenumeration
survey. Those that didn't have a survey used
some form of demographic analysis to derive their estimates
of census coverage.

Among the developing countries with enumerated populations
of 1 million or more that were studied by IDDC, the
estimated rates of undercoverage varied significantly, from
0.3 percent in Sri Lanka to 11 percent in Liberia.

It  is  evident  from  table  2  and  from  our  own  experience 
that  measuring  census  undercount  even  at  the  national  level 
(in  countries  with  deficient  or  weak  statistical  reporting 
systems)  is  no  easy  matter.  The  measures  derived  from  a 
conventional  PES  are  not  always  acceptable  at  face  value  and 
may  require  adjustment  for  more  accuracy  in  reflecting 
actual  levels  of  underenumeration.  Again,  the  data  we  present 
here  are  intended  to  provide  some  additional  perspective  on 
the  overall  issue  of  census  undercount. 
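
To show how an undercoverage rate like those in table 2 translates into an adjusted count, the following minimal Python sketch converts an enumerated population and a percent undercoverage into an implied total. It assumes, purely for illustration, that the rate is expressed as a share of the true (adjusted) population; the table does not state the base, and the function name adjusted_population is hypothetical. The Sri Lanka inputs (12,690 thousand enumerated; 0.3 percent by the country's own estimate and 1.0 percent by IDDC) are read from table 2.

```python
def adjusted_population(enumerated_thousands: float, undercoverage_pct: float) -> float:
    """Return the implied total population, in thousands.

    Assumption: the undercoverage rate is a share of the true population,
    so that enumerated = true * (1 - rate).
    """
    if not 0 <= undercoverage_pct < 100:
        raise ValueError("undercoverage_pct must be in [0, 100)")
    return enumerated_thousands / (1 - undercoverage_pct / 100)


# Sri Lanka, 1971 census, figures as read from table 2.
SRI_LANKA_ENUMERATED = 12_690  # thousands
own_estimate = adjusted_population(SRI_LANKA_ENUMERATED, 0.3)   # country's PES rate
iddc_estimate = adjusted_population(SRI_LANKA_ENUMERATED, 1.0)  # IDDC rate

print(f"Country estimate: {own_estimate:,.0f} thousand")
print(f"IDDC estimate:    {iddc_estimate:,.0f} thousand")
```

Under this assumption the two rates differ by roughly 90 thousand persons in the implied total, which illustrates why the choice between a country's PES figure and an evaluated figure matters even when both rates look small.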


APPENDIX  A 

DEVELOPED  COUNTRIES  CONTACTED 

FOR  EXPERIENCE  WITH  CENSUS 

COVERAGE  MEASUREMENT 


Country                       Received reply   1977 estimated population (thousands)

Australia                     Yes              14,062
Austria                       Yes              7,522
Belgium                       Yes              9,827
Bulgaria                      Yes              8,805
Canada                        Yes              23,323
Czechoslovakia                Yes              15,030
Denmark                       Yes              5,089
Federal Republic of Germany   Yes              61,392
Finland                       Yes              4,740
France                        Yes              53,103
Greece                        Yes              9,252
Hungary                       Yes              10,648
Ireland                       No               3,196
Italy                         Yes              56,436
Netherlands                   Yes              13,853
Norway                        Yes              4,044
Poland                        Yes              34,698
Portugal                      Yes              9,725
Romania                       Yes              21,664
Spain                         Yes              36,351
Sweden                        Yes              8,255
Switzerland                   Yes              6,289
United Kingdom                Yes              55,956
USSR                          Yes              258,900
Yugoslavia                    Yes              21,768

Source: U.S. Department of Commerce, Bureau of the
Census.  World  Population:  1977— Recent  Demographic 
Estimates  for  the  Countries  and  Regions  of  the  World. 
Washington,  D.C.  1978. 


APPENDIX  B 

SUMMARY  OF  CENSUS  COVERAGE 
ACTIVITIES  IN  SELECTED 
DEVELOPED  COUNTRIES 

Australia 

Latest  census:  1976 

Census counts were adjusted to account for undercoverage
(adjusted current population figures were used for
apportionment of electoral seats and funds to States)
because their PES showed substantial variation in amount
of undercount between States. (They feel that their adjustment
procedure could have been better if they had known
prior to the census that an adjustment would have to be
made so that their PES could have been altered to provide
better estimates of undercount.)

Although census counts were adjusted in the current population
estimates to reflect the estimated undercount to
provide official population figures, the census results themselves
were not adjusted, resulting in two sets of numbers.
(One of the major problems was that users were not adequately
informed about the adjustments.)

Historical series were adjusted: 1961 figures left unadjusted,
1971 figures adjusted by 1.35 percent, 1966 by 0.5
percent, and the 1976 figures by 2.7 percent (leaving some
implicit underenumeration in 1966 and 1971).

Census  figures  were  adjusted  for  official  population  counts 
down  to  the  local  level  depending  upon  metropolitan  density 
characteristics.  They  feel  confident  about  State  adjustments 
but  not  of  those  below  State  level;  however,  no  adjustment 
greater  than  4  percent  was  made,  and  none  were  negative. 

The PES sample size was 2/3 percent of households in
order to provide reliable estimates on characteristics of
missed persons.
Respondent: Brian Doyle, Director, Evaluation and User
Services Section, Australian Bureau of Statistics

Austria 

Latest  census:  1971  (Accuracy  has  been  studied  since  1961.) 

Coverage of the population is not a serious issue or
problem.

1. In 1971, a sample of their 1.5 percent microcensus was
matched case by case with the census forms. The
census count was higher than the microcensus count,
but they are not sure which was more correct.

2.  On  the  aggregate  level,  comparisons  with  voting  lists 
also  revealed  a  higher  census  figure  for  persons  19  years 
old  and  over  (approximately  0.5  percent)  than  did  the 
voting  lists.  But  the  voting  lists  were  not  necessarily 
complete. 

3.  From  a  comparison  of  the  number  of  foreigners' work 
permits  at  the  time  of  the  census,  it  is  felt  that  foreign 
workers  are  undercounted  by  as  much  as  20  percent 
(0.4  percent  of  the  total  population). 

Accuracy  measures  are  at  the  national  level  only  and  they 
do  not  adjust  results  of  the  census  because  they  are  not  sure 
of  the  reliability  of  the  accuracy  measure— the  measures  are 
really  only  good  for  pointing  up  weak  areas. 

A more detailed study of accuracy measures from the
1971 census is planned. The methods used in coverage
studies will be improved in 1981; but because of a tight
budget, a postenumeration survey will not be done.
Respondent: Dr. Lothar Bosse, President, Austrian Statistical
Office

Belgium 

Latest  census:  1970 

The  census  is  used  to  update  population  registers  that  are 
kept  at  commune  level.  Registers  have  records  of  residence 
of  Belgian  nationals  and  of  foreigners  who  are  resident  in 
Belgium. 

Registers  are  updated  by  vital  records  and  reports  of 
arrivals  and  departures  from  the  country,  by  usual  place  of 
residence. 

The  registers  are  "renewed"  by  the  census  results.  Census 
forms  are  filled  in  by  heads  of  households  as  required  by  law. 
The  census  results  are  checked  against  the  registers  and 
entries  are  made  in  the  registers  after  verification  of  where 
the  error  or  the  omission  occurred. 

The  official  census  figures  have  been  found  to  be  less 
than  0.5  percent  lower  than  the  figure  for  the  same  year 
from  the  registers. 

No adjustment is made to the series for underestimation
of the population in the census. They are not concerned
with that small amount of undercoverage.
Respondent: A. Dillaerts, Director General, Ministry of
Economic Affairs, National Statistical Institute

Bulgaria 

Latest  census:  December  1975  (population  register) 

Two postcensal sample surveys were carried out: one on the
completeness of population coverage and one on the accuracy
of registration.

Sample size in the population coverage survey was 3
percent (or 26,000 persons); about 0.31 percent of the population
were missed; about 0.37 percent were double enumerated,
thus the net error was a 0.6 percent overcount; 0.39 percent
were incorrectly enumerated by place of residence.

•  Men  more  likely  to  be  counted  wrong  than  women. 

•  Highest  frequency  of  errors  for  population  less  than  5 
years  old  and  15  to  25  years  old. 

•  Urban    population   more   frequently   miscounted   than 
rural,  also  large  towns  compared  to  small  towns. 

The  net  error  of  0.6  percent  was  considered  too  small  to 
be  concerned  about  adjustment  of  the  data. 
Respondent: Committee of the Unified System for Social
Information
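
The net coverage error reported for Bulgaria is the difference between double enumerations and omissions. The short Python sketch below works through that arithmetic with the component rates as printed; the function name net_coverage_error is hypothetical. As printed, the components (0.31 percent missed, 0.37 percent double enumerated) imply a net overcount of about 0.06 percent rather than 0.6 percent, and 26,000 persons is close to 0.3 percent rather than 3 percent of the population; the summary does not make clear which figures are misprinted. The population base is an assumption: the 1977 estimate from appendix A is used as a rough stand-in for the December 1975 census count.

```python
def net_coverage_error(missed_pct: float, double_counted_pct: float) -> float:
    """Net coverage error in percent; a positive value means a net overcount."""
    return double_counted_pct - missed_pct


# Component rates as printed in the Bulgaria summary.
net_error = net_coverage_error(missed_pct=0.31, double_counted_pct=0.37)
print(f"Net error: {net_error:+.2f} percent")

# Sample share check. Assumption: the 1977 population estimate from
# appendix A (8,805 thousand) approximates the 1975 census population.
SAMPLE_PERSONS = 26_000
POPULATION_1977 = 8_805_000
sample_share_pct = SAMPLE_PERSONS / POPULATION_1977 * 100
print(f"Sample share: {sample_share_pct:.2f} percent")
```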

Canada 

Latest  census:  1976  (censuses  conducted  quinquennially) 

Methodology for measuring undercount: reverse record
check (RRC). First used on a small scale in 1961; since 1966,
conducted on full scale. The RRC in 1976 was designed to
provide measures of undercoverage of population down to
the province level and for certain population subgroups. The
Yukon and Northwest Territories were not included in the
RRC because of the greater difficulties and higher cost that
would have been involved.

The  sample  was  constructed  using  four  frames: 

1.  Persons  enumerated  in  1971  census 

2.  Persons  born  between  June  1,  1971  and  May  31,  1976 
(from  vital  statistics  records) 

3.  Immigrants    to    Canada    between   June    1,    1971    and 
May  31,  1976  (from  immigration  registrations) 

4.  Persons  not  enumerated  in  1971  census  (from  a  ran- 
dom sample  of  the  1971  reverse  record  check). 

The  total  sample  was  33,111  persons.  The  RRC  used 
higher sampling rates in smaller provinces and for subgroups
expected  to  have  higher  undercoverage.  Each  of  the  33,111 
selected  persons  (SP)  was  searched  in  1976  census  returns. 
The  search  started  with  addresses  in  1971,  then  went  on  to 
telephone  directories,  social  and  welfare  agencies,  and  tax 
returns.  At  the  end  of  the  check,  each  SP  was  classified  as 
enumerated  in  1976  (88.2  percent),  missed  in  1976  (2.5 
percent),  died  (3.2  percent)  or  emigrated  (1.3  percent) 
before  1976,  or  tracing  failed  (4.8  percent). 

Results: After adjustments to cover reweighting for
nonresponse to the sample, undercoverage was estimated
at about 2.04 percent of the total population (about
476,500 persons). Significant differences were found in the
amount of undercoverage by province. Quebec had the
highest rate of undercoverage (2.95 percent). Certain subgroups
of the population also had differentially higher rates
of undercoverage: young persons, males, single persons, and
persons whose mother tongue was other than English
(French or other). Overall, rates of undercoverage seemed to
decline from 2.6 percent in 1966 to 2.0 percent in 1976.
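
The reverse record check figures above can be cross-checked with a few lines of arithmetic. The Python sketch below is an illustration only, not Statistics Canada's estimator: it confirms that the five classification shares sum to 100 percent, computes a naive unweighted missed rate among selected persons who were traced to an in-scope outcome, and backs out the population base implied by the published 2.04 percent and 476,500 persons. The naive rate ignores the unequal sampling rates and the nonresponse reweighting described in the text, which is one reason it differs from the published estimate.

```python
# Final classification of the 33,111 selected persons, in percent (from the text).
shares = {
    "enumerated": 88.2,
    "missed": 2.5,
    "died": 3.2,
    "emigrated": 1.3,
    "tracing_failed": 4.8,
}
total_share = sum(shares.values())

# Naive, unweighted missed rate among persons resolved as in scope in 1976
# (enumerated or missed). This is NOT the published weighted estimate.
in_scope = shares["enumerated"] + shares["missed"]
naive_missed_pct = shares["missed"] / in_scope * 100

# Population base implied by the published results.
PUBLISHED_RATE = 0.0204
PUBLISHED_MISSED = 476_500
implied_population = PUBLISHED_MISSED / PUBLISHED_RATE

print(f"Shares sum to {total_share:.1f} percent")
print(f"Naive unweighted missed rate: {naive_missed_pct:.2f} percent")
print(f"Implied population base: {implied_population:,.0f}")
```

The implied base of roughly 23.4 million is in line with the 1977 estimate of 23,323 thousand shown for Canada in appendix A.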

Census counts were not adjusted for undercoverage
although estimates of undercoverage were prepared to the
provincial level and for other demographic variables. But the
issue of preparing adjusted figures is being seriously
considered for 1981.

The issue of undercoverage is important because of
federal/provincial funds transfers which are linked to census
counts.
Respondent: Ivan P. Fellegi, Assistant Chief Statistician,
Statistics Canada

Czechoslovakia 

Latest  census:  1970 

Completeness  of  coverage  measured  through  comparison 
with  population  movement  statistics  (population  register 
system  is  in  place). 

An  adjustment  on  the  basis  of  an  estimated  undercount  is 
not  made,  but  corrections  are  made  for  specific  detected 
omissions. 

Difference between census data and population movement
statistics "negligible."
Respondent: Jan Kazimour, President, Federal Statistical
Office

Denmark 

Latest traditional population census: 1970 (mail-out type
census; population registers in use)

Since  1968,  a  number  of  registers— population,  dwellings, 
tax  information,  school  enrollment— have  been  established 
using  common  identification  numbers  of  persons  and  other 
units.  In  1976,  for  the  first  time,  a  "register  based"  census 
was  taken  combining  information  from  each  national  register. 
In  1981,  the  entire  census  will  be  taken  from  the  registers, 
using  the  identification  numbers  rather  than  forms  mailed 
out  to  and  filled  by  respondents  specifically  for  the  census. 
Respondent: Lene Skotte, Danmarks Statistik

Federal  Republic  of  Germany 

Latest  census:  1970  (also  studied  1961  results  for  coverage) 

Studies were done of coverage and quality of the census;
they estimated an undercount of about 0.9 percent of the resident
population. Counts are not adjusted. Their studies, called
process controls (or descriptive checks), are used to detect
causes of errors and to interpret census results. Three
separate studies were made of census results starting from
samples of individual documents.

1. Immediate checks: (1970) A followup survey of 0.1
percent to 0.2 percent of population in selected control
districts carried out 4 to 6 weeks after census day
using a separate questionnaire. Its purpose is to check
completeness and precision of coverage of buildings,
households, and persons; it is intended particularly to
trace omissions and double counts.


2.  Birthday  selection:  (1970)  All  persons  covered  by 
population  census  with  birthdays  on  31st  of  March, 
May,  and  July.  Coverage  of  residence  of  persons  with 
more  than  one  residence  to  eliminate  double  counting. 
(In  1961,  an  alphabetic  check  was  used  by  selecting  a 
sample  based  on  the  first  letter  of  surnames.) 

3.  Checks  of  characteristics:  Comparison  with  subsamples 
from  the  microcensus  to  ascertain  quality  of  the  data. 

Data  on  national  level  only.  In  1981,  the  issue  of  coverage 
completeness  will  be  of  greater  concern  due  to  problems 
resulting  from  poor  respondent  attitudes,  high  proportion 
of  foreigners,  and  increasing  mobility  of  population  (and 
multiple  residences). 
Respondent: Dr. Hamer, Vice-President, Federal Statistical
Office

Finland 

Latest  full-scale  census:  1970  (In  1975,  a  small-scale  census 
or  microcensus  was  carried  out.) 

In previous censuses (1950, 1960, and 1970), undercount
was not studied. In 1975, coverage of persons and dwellings
was investigated. Undercount was studied based on the population
register: everyone registers every year on January 1; each
person has an ID number and characteristics such as educational
attainment recorded. The census forms were sent out to
everyone on the register.

About  2.7  percent  of  the  registered  population  did  not 
return  their  census  forms.  Thus,  for  this  proportion  of  the 
population,  no  characteristics  are  available  beyond  what  was 
in  the  population  register;  that  means  for  those  persons,  no 
information  is  available  on  occupation  and  industry. 

About 2.0 percent of the housing units were omitted
compared to records of construction statistics and the building
registers which are maintained in some communes. This
was considered an "undercount," as no centralized address
register exists.

The  persons  not  counted  in  the  census  were  young  (less 
than  5  years  old),  20  to  29  years,  more  men,  more  divorced 
and  widowed. 

Geographic  detail  was  down  to  the  level  of  communes 
(475  in  Finland). 

Adjusted  data  were  prepared  at  commune  level  for  the 
population,  for  housing  units,  and  for  labor-force  data  by 
major  industry  grouping. 

The  user  decides  whether  to  use  adjusted  or  unadjusted 
data.  In  community  planning,  adjusted  data  used. 

Coverage in the census is important because prior to the
1975 census, nonresponse was assumed to have been negligible.
In 1980, coverage improvement work will be intensified,
better information about the census will be provided, and a
special study of reliability based on interviews will be carried
out.
Respondent: Olavi E. Niitamo, Director, Central Statistical
Office

France

Recent  censuses:  1962,  1968,  1972 

Coverage  study  was  done  for  1962  census,  nothing  more 
recent.  They  feel  that  in  the  succeeding  censuses,  the  level  of 
coverage  declined  by  comparison  with  "current  population 
evaluations."  However,  this  may  not  be  true  because  their 
migration  statistics  are  of  doubtful  quality. 

Measurement of coverage in 1962 revealed a net undercoverage
of 1.7 percent. Coverage was measured by use of:

1.  Comparison  with  existing  registers, i.e.,  voting,  social 
security,  birthdate. 

2.  Field  checks  to  catch  double  counts. 

3. A postcensal survey using area samples. The total
sample size was 20,000 housing units: 400 area sample
units of about 50 housing units each. The sampling
units were intentionally small so each would be relatively
homogeneous. The area sample units were
defined on maps by physical landmarks. Sample size
was 1 in 1,000 for most of the country, 1 in 500 in
urban areas in excess of 80,000 population, and 1 in
250 in areas of the greatest population concentration.
The area sampling units proved to be small, making
the survey a delicate and unwieldy task.
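
The sample design in item 3 can be sketched in a few lines of Python. The sketch checks the stated sample size (400 area units of about 50 housing units each) and shows how the three sampling rates would translate into design weights, assuming the usual convention that each sampled unit is weighted by the inverse of its sampling rate. The stratum labels and the counts of missed housing units are hypothetical and serve only to show the expansion step; the source reports neither.

```python
from fractions import Fraction

# Stated design: 400 area sample units of about 50 housing units each.
AREA_UNITS = 400
HOUSING_UNITS_PER_AREA = 50
total_sample = AREA_UNITS * HOUSING_UNITS_PER_AREA

# Sampling rates from the text; the stratum labels are hypothetical names.
SAMPLING_RATES = {
    "most_of_country": Fraction(1, 1000),
    "urban_over_80000": Fraction(1, 500),
    "greatest_concentration": Fraction(1, 250),
}


def design_weight(stratum: str) -> Fraction:
    """Inverse of the sampling rate: units represented by one sampled unit."""
    return 1 / SAMPLING_RATES[stratum]


def expand(sample_counts: dict) -> Fraction:
    """Weight per-stratum sample counts up to a national estimate."""
    return sum(count * design_weight(s) for s, count in sample_counts.items())


# Hypothetical counts of missed housing units found in the sample.
hypothetical_missed = {
    "most_of_country": 30,
    "urban_over_80000": 20,
    "greatest_concentration": 40,
}
estimated_missed = expand(hypothetical_missed)

print(f"Total sample: {total_sample:,} housing units")
print(f"Estimated missed units (hypothetical inputs): {int(estimated_missed):,}")
```

Because the densest areas are sampled four times as heavily as the rest of the country, each sampled unit there carries one quarter of the weight, which is why raw sample counts cannot simply be added across strata.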

Coverage  of  the  census  is  considered  a  statistical  problem 
rather  than  a  political/administrative  one.  Census  results  are 
used  administratively  but  the  local  governments  themselves 
are  responsible  for  the  census  in  their  own  communities  and 
do  the  recruiting  for  enumerators  in  their  area. 

No  systematic  adjustment  was  made  to  the  census  results 
in  1962  for  coverage.  A  "population  evaluation"  sample  is 
drawn  from  the  crude  data  from  most  recent  censuses.  The 
few  corrections  that  were  made  were  not  related  to  the 
coverage  study. 
Respondent: Edmond Malinvaud, Director General, National
Institute of Statistics and Economic Studies

Greece 

Latest  census:  1971 

The census was taken in 1 day by enumerators, to ascertain
a de facto population. Immediately after, a sample
survey was taken to measure coverage error. An estimated less
than 1 percent of the population was missed (available only at
the national level). Census data were not adjusted.

The sample was split: One part was used to measure the
proportion of housing units and entire households missed;
the second part was used to measure how many persons
within households were missed.

The  multistage  area  sample  was  selected  from  124  strata 
split  into  three  levels: 

1. Cities of more than 40,000 persons

2.  Cities  of  5,000  to  40,000  persons 

3.  The  rest  of  the  country 


Enumerators  were  generally  civil  servants  such  as  teachers. 

Census undercount is not considered to be an important
issue as it was so small; however, postenumeration surveys
will continue to be carried out. (In 1961, according to an
enclosed report, the greatest undercoverage occurred in
Greater Athens but was balanced by large duplications in
the rest of Greece.)
Respondent: Chr. Kelperis, Director General, National
Statistical Service

Hungary 

Latest  census:  1970 

A  postenumeration  survey  was  conducted  to  measure 
completeness  and  quality  of  the  content  of  the  census. 

It  was  not  used  to  correct  errors  in  the  census  but  rather 
to  detect  problems  and  plan  for  future  censuses.  The  sample 
size  was  0.25  percent  of  the  population.  The  survey  was 
conducted  like  a  second  census,  drawing  from  a  regional 
quartering  of  census  districts.  It  took  place  directly  after  the 
census.  The  sample  included  7,900  households  and  27,000 
persons.  Prior  to  the  time  of  the  actual  survey,  sampled 
households  were  contacted  to  ensure  cooperation. 

Although the PES was intended to measure quality of
data collected rather than coverage, 0.4 percent of the
population was estimated to have been missed.
Respondent: Dr. Vera Nyitrai, President, Central Statistical
Office

Italy 

Latest  census:  1971 

Comparisons  were  made  of  census  counts  with  communal 
population  registers.  In  1971,  the  registers  were  about 
900,000  higher  (1.7  percent)  than  census  counts.  This  was 
blamed  on  failure  to  keep  the  registers  up  to  date  with 
emigration.  Southern  Italy  in  particular  has  high  emigration. 
They  now  concentrate  on  intercensally  updating  their 
communal  registers  for  emigration. 

Each census form is compared with register cards for each
commune. Their estimate of coverage is made only at the
national level. The census was used to judge the quality
of the communal registers.
Respondent: Luigi Pinto, Director General, Central Statistical
Institute

Netherlands  (Response  received  after  conference.) 
Latest  census:  1971 

Municipal population registers are the basis of the population
census. From the registers, data on population by age,
sex, marital status, and nationality are compiled annually.

Prior to 1971, the census was required to count all persons
who were supposed to be on the register. The census
was used to check the completeness and accuracy of the
register by carrying through a case-by-case check of all
persons listed in the registers and on the census. In 1971,
this case-by-case check was not carried out; also, no record
was made of the illegal aliens in the Netherlands. As the
case-by-case check was not carried out, a difference of 0.5
percent between the total number of registered persons and
persons listed in the census remained.

About 2.3 percent of the persons listed on the municipal
registers failed to respond to the census; they were either
not at home or refused to cooperate. However, characteristics
for these persons were available from their records
in the municipal registers, and thus they were included in
census publications.

For  the  1981  census,  it  is  possible  that  legislation  will  be 
passed  requiring  the  census  to  exclusively  enumerate  persons 
already  registered  in  the  municipal  registers  to  prevent  any 
implication  of  the  census  being  used  to  track  down  illegal 
aliens  in  the  Netherlands. 
Respondent: J. Schmitz, Head of the Department for Social
Accounts, Central Bureau of Statistics

Norway 

Latest  census:  1970 

In 1970, the census was used as a control for the population
register; in that year, the central register was matched to
the local registers. Since 1970, no quality study of the registration
system has been made, but it is felt that it provides
good information down to the municipality level.

The basis of the census in 1980 will be a central population
register at the Central Bureau of Statistics. The Central
Bureau has received monthly updates from local registers
since 1970. The 1980 census will have an evaluation survey.
It should give some information on the completeness of the
register.

They, too, are trying to determine how to present measures
of incompleteness in the census.
Respondent: Erik Aurbakken, Acting Director, Central
Bureau of Statistics

Poland 

Latest  census:  1978 

In 1978, for the first time, a postcensus investigation on a
1-percent sample was conducted to check correctness of the
census by comparing census data with data collected by
enumerators in dwellings in the sample. Results of this study
will be used only to check accuracy of the census, not to
adjust it. Poland has a population register system.

In  the  census,  the  data  were  used  to  correct  the  current 
estimates  of  the  population,  which  are  kept  current  from 
vital  statistics  and  migration  data.  Corrections  of  the  current 
estimates  based  on  the  census  at  the  national  level  varied 
from  0.3  to  0.4  percent,  so  no  problems  were  created  for 
data  users.  At  the  local  level,  greater  differences  between 
estimates  and  census  counts  were  found,  due  to  inaccuracies 
in  registration  of  internal  migration. 


Respondent: Prof. Stanislaw Kuzinski, President, Central
Statistical Office

Portugal 

Latest  census:  1970 
Forthcoming  census:  1981 

No  coverage  control  for  technical  accuracy  of  1970  census 
was  undertaken,  although  coverage  work  was  done  in  some 
areas  of  the  country. 

In  1981,  a  coverage  control  with  technical  accuracy  is 
planned.  The  methodology  to  be  used  is  now  being  studied 
by  the  National  Statistical  Institute.  They  will  send  the 
Bureau  a  report  on  the  methodology  when  the  research  is 
concluded. 
Respondent: J. F. Graca Costa, President, Council of
Direction, National Statistical Institute

Romania 

Latest  census:  1977 

Completeness  of  the  census  was  considered  satisfactory 
when  resident  population  from  the  census  was  approximately 
equal  to  figures  derived  from  current  records.  No  adjustment 
was  made,  but  detected  omissions  were  corrected  at  all 
geographic  levels. 

A  number  of  matching  studies  undertaken  during  the 
census  were: 

1.  Several  days  before  census  took  place,  a  preliminary 
visit  was  made  to  identify  housing  units  and  the 
number  of  persons  per  unit. 

2.  Matching  at  local  level  with  voting  lists,  agriculture 
register,  and  population  register. 

3.  At  centralized  level,  matching  of  migration  records 
within  country  to  responses  on  census  item. 

After  the  census,  a  "check  survey"  was  carried  out.  A 
sample  of  households  was  selected  by  a  two-stage  sampling 
process.  In  the  first  stage,  a  sample  of  census  sectors  was 
selected  and  in  the  second,  a  sample  of  households.  The 
final  sample  size  was  11,750  households  (0.18  percent  of 
total) and 41,355 persons (0.9 percent of total). This survey
resulted in estimates that 0.094 percent of households and
0.155 percent of individuals in households were missed at
the national level. (Omissions were largely due to interviewers
misinterpreting instructions.)

Respondent: Ilie Şalapa, Director General, Central Statistical
Office

Spain  (Response  received  after  conference.) 

Latest  census:  1970 

Spain maintains a population register listing characteristics
of persons by usual place of residence and previous place of
residence.

After the 1970 census, an evaluation survey was carried
out at the national level, stratified by metropolitan area, to
study errors in coverage and content in the census. A sample
of area segments was selected in which a reinterview
(although the method of the census was partially
self-enumerative) was conducted to check residence and
occupancy in housing units on the day of the census.

The  results  of  the  survey  revealed  a  2.3-percent  overcount 
of  the  de  jure  population.  As  a  consequence,  the  issue  of 
coverage  errors  is  not  considered  to  be  important. 

The unadjusted census population figures are used as
official counts for administrative as well as other official
purposes. However, adjusted figures are used for demographic
studies and population projections.

Respondent: F. Azorin, President, National Statistical
Institute

Sweden 

Latest  census:  1975 

Censuses were conducted in 1970 and 1975. Two population
registers are maintained (one of the total population
and the second for use in sampling all persons born on the
15th of each month).

Completeness  of  coverage  is  not  measured,  but  each 
census  form  is  checked  off  against  register  entries.  The 
population  register  is  continually  updated.  In  evaluation 
studies  after  1970,  a  pilot  study  was  made  of  completeness 
of  coverage  of  housing  units  (conducted  by  mailmen).  Also, 
an  attempt  was  made  to  measure  completeness  of  coverage 
of  the  economically  active  population.  Census  data  were  then 
compared  with  data  collected  in  the  two  coverage  studies. 
Estimates of coverage were made only for the national level
and for some metropolitan areas.

Adjustments of economically active persons and housing
units were made for regional planning but were not official
census data.

Sweden is concerned with coverage, especially for the
economically active population, but "coverage" refers to
the determination of whether or not a person was
economically active.

Respondent: Lennart Fastbom, Deputy Director General,
National Central Bureau of Statistics

Switzerland 

Latest  census:  1970 

No  special  study  of  completeness  of  coverage  was  made. 
Local  administrators  checked  incoming  questionnaires  against 
population  registers;  rectification  of  enumerator  errors  was 
completed  locally,  and  the  forms  were  sent  to  the  central 
census  office. 

Based  on  the  experience  of  larger  cities,  the  number  of 
individuals  missed  in  1970  was  not  more  than  0.3  percent 
of  the  total  population.  Census  results  are  not  adjusted. 


Respondent:    R.    Rotach,   Section    of  the   Census,    Federal 
Office  of  Statistics 

U.S.S.R. 

Latest  census:  1979 

Coverage  measures  used:  Postenumeration  survey,  issuance 
of  census  certificates,  and  completion  of  control  forms.  The 
postenumeration  survey  was  carried  out  directly  after  the 
census  in  25  percent  of  the  dwellings  in  urban  areas  and  in  all 
dwellings  of  25  percent  of  the  rural  sections  by  inspectors 
and  enumerators.  Inspectors  checked  registrations  of  all 
families  and  single  dwellers;  those  missed  by  the  census  were 
registered. 

Census certificates were issued to all transients (long- and
short-term) and to all persons even contemplating travel
during the period of the census and the postenumeration
survey.

Control forms were filled out by enumerators during the
census and filled out by inspectors during the PES for individuals
who  thought  they  had  been  missed.  After  a  comparison  of 
the  control  forms  to  the  census  records,  information  for 
missing  persons  was  transferred  to  census  forms. 

The respondent indicated that these measures assured
complete enumeration of the population.

Respondent: M. A. Korolev, First Deputy Director, Central
Statistical Board

United  Kingdom 

Latest  census:  1971 

The issue of an undercount is not considered to be important;
however, they feel that it is important to go out and
prove that by some sort of coverage study. Census results are
not adjusted.

In  1971,  reenumeration  and  independently  compiled  lists 
were  used.  Both  were  found  to  be  unsatisfactory.  (The  first 
was  not  independent;  the  second  was  not  controlled  for 
accuracy.) 

In 1981, a small independent unit of people will be set up
in a central office as well as in the field with the sole
responsibility of coverage checks. Addresses will be checked against
local taxation lists. Also a 0.5-percent sample will be taken
for a labor force survey after the census, identified by
enumeration district, to match against census results. The sample
will be larger than normal so the special coverage unit can
conduct reenumeration interviews on a sample independent
of the census.

The  size  of  the  sample,  0.5  percent  of  the  addresses,  will 
allow  estimates  to  be  made  down  to  the  regional  level.  So  far, 
only addresses are measured, not people.

Respondent: S. C. Boxer, Head of Census Division, Office of
Population Censuses and Surveys

Yugoslavia (Response received after conference.)

Latest census: 1971

Following  the  census,  a  coverage  control  study  was 
carried  out  to  detect  omissions  or  "double  counting"  of 
persons,  households,  and  housing  units.  This  coverage  study 
was  conducted  in  a  stratified  sample  of  enumeration  districts 
covering  each  republic  and  autonomous  province.  Of  the 
total  83,593  enumeration  districts  in  Yugoslavia,  403  were 
in  the  sample. 

Results of the survey revealed an undercount of 0.8 percent
of persons and about 0.9 percent of households. There
were  some  differences  in  the  proportions  of  undercount  in 
the  individual  provinces  and  by  rural  or  urban  residence; 
however,  these  differences  were  small.  Persons  who  were 
reported  to  be  employed  abroad  had  a  very  high  undercount, 
around  7.7  percent.  Census  results  were  not  adjusted  based 
on  the  results  of  the  coverage  control  study.  The  level  of 
reliability  of  the  census  results  as  revealed  by  the  results  of 
the  coverage  study  is  considered  to  be  important  information 
for  governmental  agencies  and  for  other  data  users.  The 
results  of  the  coverage  study  are  used  mainly  for  planning 
for the next census.

Respondent: Ibrahim Latific, Director, Federal Statistical
Office

APPENDIX  C 

SOURCES  AND  COMMENTS  FOR  DATA  ON 
SELECTED  DEVELOPING  COUNTRIES 

Algeria 

Direction des Statistiques et de la Comptabilite
Nationale. La Situation Démographique en Algerie 1967-1978,
p. 6. Algiers, 1979.

Official  adjustment  of  population  is  based  on  a  PES. 
Algeria  includes  adjustment  for  undercount  in  official 
postcensal  estimates. 

Bangladesh 

Bangladesh. Census Commission. Bangladesh Population
Census 1974. Bulletin 2. Census Publication No. 26.
Dacca, 1975. Table 1.

Bangladesh.  Census  Commission,  and  United  Kingdom. 
Ministry  of  Overseas  Development.  Report  on  the  1974 
Bangladesh  Retrospective  Survey  of  Fertility  and 
Mortality.  London,  1977,  p.  3. 

U.S.  Department  of  Commerce,  Bureau  of  the  Census 
WP-79  is  based  on  the  figure  in  table  2.  The  officially 
cited  underenumeration  estimate  of  6.88  percent  is 
derived  by  using  the  unadjusted  population  as  a  base. 
The Bureau's practice is to base the percentage on the
adjusted population, which in this case reduces the percent
undercount to 6.44 percent. Bangladesh bases its
official time series on this adjusted census figure.
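
The change of base described in this note can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the 6.88-percent figure expresses missed persons as a share of the enumerated (unadjusted) count, so that the adjusted count equals the enumerated count plus the missed persons.

```python
def undercount_pct_of_adjusted(pct_of_unadjusted: float) -> float:
    """Re-express an undercount percentage on the adjusted-population base.

    Assumption: pct_of_unadjusted = 100 * missed / enumerated.
    Since adjusted = enumerated + missed, the share of the adjusted
    population that was missed is u / (1 + u), with u = missed / enumerated.
    """
    u = pct_of_unadjusted / 100.0
    return 100.0 * u / (1.0 + u)


# Officially cited figure (unadjusted base) converted to the adjusted base.
print(round(undercount_pct_of_adjusted(6.88), 2))  # 6.44
```

The conversion always lowers the percentage, because the same number of missed persons is divided by a larger base.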

Bolivia 


Unpublished estimates and projections prepared by
Mario Gutierrez Sardan of INE (Instituto Nacional de
Estadística, La Paz) with the collaboration of CELADE
(1979, table 2).

Official  adjustment  of  population  is  based  on  a  PES. 
Upon  preliminary  review,  INE's  analysis  seems  good;  a 
7-percent  figure  is  higher  than  most.  The  INE  figures  will 
be  official  once  projections  are  published. 

The previous Census Bureau estimate of underenumeration
was 4.23 percent, based on an analysis of preliminary
sample census figures. Pending further analysis
of the final census and of the results of the PES, the
Census Bureau is accepting the official estimate of
underenumeration.

Cameroon 

Cameroon Direction de la Statistique et de la
Comptabilite Nationale. Recensement General de la
Population et de l'Habitat d'Avril 1976. Volume II,
tome 1 (1979). Yaounde, p. 7.

The  official  estimate  of  underenumeration  is  based  on 
a postenumeration survey.

Chile

Chile Oficina de Planificacion Nacional. Proyección
de  la  Poblacion  de  Chile  por  Sexo  y  Grupos  Quinquenales 
de  Edad.  1950-2000.  Santiago,  1975. 

U.S.  Department  of  Commerce,  Bureau  of  the  Census. 
Country  Demographic  Profiles— Chile  by  Sylvia  Quick. 
Washington,  D.C.,  1978. 

The  Chile  Oficina  de  Planificacion  Nacional  (1975) 
estimated  midyear  population  figures  for  every  year,  1950 
to  1970,  based  on  demographic  analysis.  The  U.S.  Bureau 
of  the  Census  (1978,  table  2)  estimated  the  adjusted 
census  population  for  1970  shown  above  based  on  the 
official  midyear  1970  population  and  the  1969  to  1970 
growth  rate. 
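
The source does not state how the midyear figure was carried to the census date. One common approach, shown here purely as a sketch, is to assume exponential growth at a constant annual rate between the two midyear points. The function names and the numbers in the usage lines are hypothetical and are not taken from the Chilean data.

```python
import math


def annual_growth_rate(pop_start: float, pop_end: float, years: float = 1.0) -> float:
    """Average annual exponential growth rate between two population figures."""
    return math.log(pop_end / pop_start) / years


def population_at_date(midyear_pop: float, rate: float, years_from_midyear: float) -> float:
    """Carry a midyear population forward (positive offset) or backward
    (negative offset) under the assumed constant exponential growth rate."""
    return midyear_pop * math.exp(rate * years_from_midyear)


# Hypothetical inputs: two consecutive midyear estimates and a census
# taken a quarter of a year before the later midyear date.
rate = annual_growth_rate(9_000_000, 9_180_000)
census_date_pop = population_at_date(9_180_000, rate, -0.25)
print(round(census_date_pop))
```

A linear interpolation would give nearly the same answer over so short an interval; the exponential form is used only because it matches the usual definition of a growth rate.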

Colombia 

DANE (low): Departamento Administrativo Nacional
de Estadística (DANE). Boletin Mensual de Estadística,
no. 314 (Sept., 1977), p. 31.

DANE (high): Departamento Administrativo Nacional
de Estadística. Boletin Mensual de Estadística, no. 308
(March 1977), p. 9.

Gonzalez.  XIV  Censo  Nacional  de  Poblacion  y  III 
Vivienda:  Cobertura  Censal,  1976,  table  7. 

U.S.  Department  of  Commerce,  Bureau  of  the  Census. 
Country  Demographic  Profiles:  Colombia.  Washington, 
D.C.,  1979,  table  2. 

The range in the "enumerated" population is attributable
to different estimates of the population in the
Armed Forces and the population in sparsely populated
areas (National Territories). Range in adjusted census
following the postenumeration survey of 1974 is due to a
range in estimate for Bogota, the indigenous population,
the Armed Forces, and group housing. See discussion of
census and PES in: Ocha, L.H. and M. Pardo. "Estimaciones
de la Poblacion de Colombia en 1973: Una Reconstruccion
Critica." Unpublished working paper. Pontificia
Universidad Javeriana, 1979.

It  is  not  clear  which  estimate  is  official,  as  new 
estimates  continue  to  be  published. 

U.S. Census Bureau estimates are based on projections
of a 1964 adjusted census population, and estimated
fertility, mortality, and migration data.

Ecuador

Results of postenumeration survey conducted by
Oficina de los Censos Nacionales (OCN) reported by
Cavallini, G. "Informe de la Mision de Asesoría Realizado
en la Republica del Ecuador desde el 6 al 30 de Marzo de
1976." United Nations. Unpublished mission report,
1976, p. 24.

Estimates of OCN and Bureau of the Census are very
close. An alternative evaluation by Carlos Cavallini via
demographic analysis sets total underenumeration at 4
percent; however, differences in ages over 5 between the
PES-Chandrasekaran-Deming and demographic methods
were insignificant.
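
For readers unfamiliar with the Chandrasekaran-Deming approach named above, the following is a minimal sketch of the standard dual-system calculation, not the procedure OCN actually followed. It assumes that the census and the PES are independent and that records are matched without error. The function name and the counts in the usage lines are hypothetical.

```python
def dual_system_estimate(matched: int, census_only: int, pes_only: int):
    """Chandrasekaran-Deming style estimate from matched census and PES records.

    matched      -- persons found in both the census and the PES
    census_only  -- persons found in the census but not in the PES
    pes_only     -- persons found in the PES but not in the census
    Returns (estimated total, percent undercount of the census on the
    estimated-total base).
    """
    if matched <= 0:
        raise ValueError("at least one matched record is required")
    # Under independence, the number missed by both sources is estimated
    # as census_only * pes_only / matched.
    missed_by_both = census_only * pes_only / matched
    total = matched + census_only + pes_only + missed_by_both
    census_count = matched + census_only
    undercount_pct = 100.0 * (total - census_count) / total
    return total, undercount_pct


# Hypothetical counts for one sample area.
total, pct = dual_system_estimate(matched=900, census_only=50, pes_only=40)
print(round(total, 1), round(pct, 2))
```

If the two sources tend to miss the same people, the independence assumption fails and the estimated total is too low.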

El  Salvador 

Consejo Nacional de Planificacion y Coordinacion
Economica (CONAPLAN). Indicadores Economicos y
Sociales, January-June, 1976. San Salvador.

and Direccion General de Estadística y Censos
(DIGESTIC). La Poblacion de El Salvador por Sexo y
Edad en el Periodo 1952-2000, Principales Indicadores
Demograficos. San Salvador, 1976.

Estimate  for  the  census  date  based  on  the  official 
adjusted  midyear  population  for  1970  (CONAPLAN  and 
DIGESTIC,  1976,  table  17)  and  an  estimated  midyear 
1970  to  midyear  1971  growth  rate. 

Hong  Kong 

For PES:

Hong Kong Census and Statistics Department. Country
Report of Hong Kong. 1977 Mimeo.

Hong Kong Population and Housing Census
1971: Main Report. Hong Kong, 1972, p. 8.

For Official Estimates:

Hong Kong Monthly Digest of Statistics. Hong
Kong, 1979, table 15.1


India 


U.S. Department of Commerce, Bureau of the Census.
Country Demographic Profiles - India. Washington, D.C.,
1978, p. 4.

India, Registrar General and Census Commissioner.
Census of India 1971. General Population Tables. Series
1 - India, Part II-A (i). New Delhi, 1975.


Indonesia 

East-West Population Institute. Proceedings of 3rd
Population  Census  Tabulation  Workshop-Conference, 
Postcensal  Considerations.  Honolulu,  1974,  p.  20. 

U.S.  Department  of  Commerce,  Bureau  of  the  Census. 
Country  Demographic  Profiles  -  Indonesia.  Washington, 
D.C.,  1979. 

Iran 

Statistical Centre of Iran. National Census of Population
and Housing, November 1976, Based on 5-Percent
Sample, Total Country. Table 1, 1978.

Eorey,  Joseph.  U.N.  Development  Program  Office, 
Tehran.  "Progress  Report  on  the  1976  Iranian  Population 
and  Housing  Census."  Abstract  of  report  in  East-West 
Center,  East-West  Population  Institute.  Asian  and  Pacific 
Newsletter,  vol.  4,  no.  4  (May  1978),  p.  3.  Honolulu. 

Official  adjustment  of  population  is  based  on  a  PES. 
Iran  has  not  included  adjustment  for  undercount  in 
official  postcensal  estimates.  PES  result  is  "preliminary." 
IDDC  is  using  the  reported  3-percent  undercount  but 
suspects  the  undercount  was  higher. 

Israel 

Central Bureau of Statistics. The Demographic Characteristics
of the Population in Israel 1972-1976. Jerusalem,
1978.

Adjustment  was  made  based  on  1961  census  and 
births,  deaths,  and  migration.  Adjustment  was  made  for 
Jews  only  and  for  ages  0  to  9  years  only. 

Jordan 

Official  estimate  of  underenumeration  is  reported  in 
UNPVSR,  October  1979. 

The 4.0-percent figure was not obtained through a
postenumeration survey; the method used to arrive at it is
unknown. Official census volumes and subsequent population
estimates utilize the official adjusted census population.

International Demographic Data Center accepts the
4.3-percent underenumeration recommended in an analysis
by Wander (in Central Bureau of Statistics. Analysis
of the Population Statistics of Jordan. Amman, 1966).
Note that the 4.3-percent underenumeration estimated by
Wander is published by the same organization that carried
out the census. This figure is not used in subsequent
official estimates.

Korea 

Marks,  Eli  S.,  and  Glenda  Finch.  "Developments  in 
Techniques  of  Census  Evaluation."  Unpublished  paper 
presented  at  the  biennial  meeting  of  the  International 
Association  of  Survey  Statisticians.  New  Delhi,  1978. 

Department of State Airgram, No. 2-34, March 23,
1976.

U.S. Department of Commerce, Bureau of the Census.
Country Demographic Profiles: Republic of Korea.
Washington, D.C., 1978.

The  evaluation  of  the  1970  census  done  at  the  Census 
Bureau  was  based,  in  part,  on  the  results  of  the  1970 
PES,  which  are  not  accepted  as  official  estimates  of 
underenumeration.  Official  population  figures  that  are 
shown  in  publications  are  based  on  unadjusted  census 
data. 

Liberia 

Ministry  of  Planning  and  Economic  Affairs.  1974 
Population  and  Housing  Census.  Final  Population  Results 
for  Liberia  and  Major  Political  Divisions.  PC-1.  Monrovia, 
1977,  pp.  18  and  60. 

The  official  enumerated  population  is  not  available. 
The  de  jure  census  population  was  officially  adjusted, 
based  on  a  postenumeration  survey. 

Malaysia 
(Peninsular) 

Malaysia,  Department  of  Statistics.  An  Interim  Report 
on  the  Postenumeration  Survey.  Kuala  Lumpur,  1973. 

U.S.  Department  of  Commerce,  Bureau  of  the  Census. 
Country  Demographic  Profiles— Malaysia.  Washington, 
D.C.,  1979. 

Official  adjustment  of  population  is  based  on  a  PES. 
Estimates of underenumeration derived at the U.S. Bureau
of the Census imply that the PES estimate of
underenumeration was a little too low.

Pakistan 

Pakistan  Statistical  Division.  Census  Evaluation  Survey, 
Population  Census,  1972.  Karachi,  1974. 

Official  estimates  of  underenumeration  are  based  on 
the  results  of  a  PES.  Pakistan  does  not  use  the  inflated 
total  in  current  estimates. 

Paraguay 

PES estimate based on the combined procedures estimate
in table 10.3 of Marks, Eli S. "The Role of Dual System
Estimation in Census Evaluation," in Krotki, Karol (ed.).
Developments in Dual System Estimation of Population
Size and Growth. Edmonton, Alberta: University of
Alberta Press, 1978, pp. 156-188.

The PES results are preliminary, and there has been no
communication from the Direccion General de Estadística
y Censos as to whether they have accepted these PES
results. It should also be noted that: (a) There were
actually two procedures used which yielded different
results. The above results are a combination of the two.
(b) The PES did not cover all of Paraguay; it excluded the
Chaco and some other sparsely settled areas (Marks, 1978,
p. 172) and the institutional population. (c) The coverage
estimates are subject to sampling variability; the standard
deviation of the estimated percent completeness is about
0.7 percent.

The U.S. Bureau of the Census estimate of underenumeration
is higher due to the use of the PES adjustment
for ages 5 years and over and an adjusted population
under age 5. This population was derived from the adjusted
female population and estimates of fertility and mortality
for the 5 years prior to the census.

Peru 

Oficina Nacional de Estadistica y Censos (ONEC).
"Perspectivas de Crecimiento de la Poblacion del Peru
1960-2000."  Boletin  de  Analisis  Demografico,  no.  16 
(1975),  p.  103. 

The  estimates  of  underenumeration  from  the  analysis 
of  ONEC  are  based  on  the  midyear  1970  population, 
reverse-survived  from  the  enumerated  June  4,  1972 
population,  compared  with  the  1970  population  corrected 
for  both  household  underenumeration  and  the  average 
number  of  persons  per  household.  The  corrections  used 
were  based  on  the  results  of  the  postenumeration  survey, 
which  indicated  an  underenumeration  of  3.46  percent  of 
households. 

The  procedure  used  by  the  U.S.  Bureau  of  the  Census 
to  estimate  underenumeration  for  Peru  is  similar  to  the 
one  described  above. 

Official  projections  from  CELADE  and  ONEC  imply 
4.9  percent  underenumeration. 

Sierra  Leone 

The  final  census  population  and  adjusted  population 
are  reported  in  PVSR  October  1979. 

No  postenumeration  survey  was  carried  out. 

South  Africa 

Department  of  Statistics.  South  African  Statistics  1978. 
Pretoria.  P.  1.4. 

Estimate  of  underenumeration  calculated  at  the  U.S. 
Bureau  of  the  Census  from  official  South  African  time 
series.  The  method  of  estimation  used  in  deriving  official 
South  African  time  series  is  unknown.  No  PES  was 
known  to  have  been  taken. 

Sri  Lanka 

Nadarajah,  Thambiah.  "Sri  Lanka.  The  1971  Census  of 
Population  and  Housing."  Introduction  to  Censuses  of 
Asia  and  the  Pacific  1970-74,  edited  by  Lee-Jay  Cho. 
Honolulu:  East-West  Population  Institute,  1976,  p.  179. 

Official estimate of underenumeration is based on preliminary
analysis of the December 1971 postenumeration
survey.

Census population adjusted at the U.S. Bureau of the
Census is based on the preliminary census figure of
12,712,277, adjusted for estimated underenumeration
based on demographic analysis.

Official Sri Lankan estimates are not based on adjusted
figures.

Sudan

Department  of  Statistics,  Population  Census  Office. 
1977.  Second  Population  Census  1973,  vol.  1,  tables  9 
and  12. 

Official  adjustment  of  population  is  based  on  a  PES. 
Sudan has not regularly included adjustment for undercount
in official postcensal estimates (not, for example,
in estimates reported in PVSR). IDDC is using the official
estimate  of  the  undercount  but  suspects  undercount  was 
higher. 

Thailand 

For  enumerated  population: 

National Statistical Office. 1970 Population and
Housing Census. Whole Kingdom. Bangkok, 1973. Table
1-A.

For estimated underenumeration and adjusted population:

Arnold, Fred, and Mathana Phananirama. Revised
Estimates of the 1970 Population of Thailand. Research
Paper No. 1, National Statistical Office. Bangkok, 1975.
Table 13.

For Bureau of the Census estimates:

U.S.  Department  of  Commerce,  Bureau  of  the  Census. 
Country  Demographic  Profiles— Thailand.  Washington, 
D.C.,  1978.  Tables  1  and  2. 

Yemen  (Sana'a) 

Steffen, Hans. Yemen Arab Republic: Final Report.
Airphoto Interpretation Project, Swiss Technical Cooperation
Service, carried out for the Central Planning
Organization, Sana'a, 1978, pp. I/57-59.

Official  adjustment  of  population  is  based  on  a  PES. 
The  official  adjusted  population  includes  an  estimated 
population  of  48,602  living  in  areas  not  covered  by 
the  census.  This  adjustment  is  in  addition  to  the  PES 
adjustment. 

No  official  postcensal  estimates  by  the  government  of 
Yemen  (Sana'a)  are  available. 


APPENDIX D

LETTER SENT TO CHIEF STATISTICAL
AGENCY OF EACH SELECTED DEVELOPED
COUNTRY

One  of  the  significant  emerging  issues  of  the  1970's  in 
connection  with  census  undertaking  is  the  completeness  of 
coverage  of  the  population.  In  the  United  States,  as  you  may 
have  read,  our  studies  have  shown  that  in  the  1970  census  we 
missed  about  2  1/2  percent  of  the  population  with  significant 
differentials  for  subgroups  of  the  population  and  for  age  and 
sex.  The  issue  of  the  undercount  in  the  United  States  census 
has  become  very  important  because  of  the  many  programs 
which  now  rely  on  census  data— from  the  distribution  of 
billions  of  dollars  of  Federal  funds  to  the  impact  on  political 
representation.  Since  this  issue  is  becoming  so  dominant 
relative  to  the  1980  census,  we  are  interested  in  learning 
about  the  experiences  of  other  countries  in  regard  to  census 
undercounts,  specifically: 

1.  Do  you  routinely  measure  the  completeness  of  coverage 
of  population  in  your  censuses? 

2.  What  kind  of  methodology  is  used? 

3.  How    extensive    is    the    geographic    detail    for  which 
coverage  estimates  are  prepared? 

4.  Are  the  census  data  adjusted  and,  if  so,  what  specific 
uses  are  made  of  the  adjusted  series? 

5.  Are  census  data  (unadjusted)  used  for  some  purposes 
whereas  adjusted  figures  are  used  for  others? 

6.  Is   the   question  of  census  undercount  an   important 
issue  in  your  country? 

I  would  very  much  appreciate  hearing  from  you  on  this 
subject.  Any  publications  or  other  material,  e.g.,  memoranda, 
that  your  office  has  issued  relative  to  the  question  on  census 
undercounts  and  adjustments  would  be  most  welcome. 

Sincerely, 

DANIEL  B.  LEVINE 
Acting  Director 
Bureau  of  the  Census 


Floor  Discussion 


In  response  to  a  question  on  how  many  people  do  not 
register  in  the  required  registration  for  voting  in  Australia, 
Mr.  Doyle  said  the  electoral  office  takes  over  corrected 
district  data  and  works  out  how  many  people  should  be 
registered.  In  areas  with  over  5-percent  nonregistration, 
regular  followup  surveys  are  carried  out.  Less  than  1  percent 
of  the  population  fails  to  register.  There  are  a  small  number 
of  illegal  immigrants,  who  arrive  by  boat,  but  internal  travel 
is  difficult  and  immigration  laws  are  strict. 

Another  member  of  the  audience  asked  when  the 
corrected  estimates  of  Australian  population  appear  after  a 
census.  Mr.  Doyle  said  that  estimates  based  on  the  1976 
census  will  still  be  produced  through  March  1982.  Corrected 
estimates  are  produced  continually.  The  1981  census  will  be 
incorporated  in  the  estimates  series  starting  about  1982.  In 
March  1982,  an  adjusted  total  population  figure  for  the  1976 
estimate  will  be  produced  using  the  1981  census  counts.  This 
estimate  for  1976  will  be  adjusted  for  undercount,  but  it  will 
also  be  used  to  adjust  the  de  facto  figure  to  a  de  jure 
basis— taking  out  visitors  to  Australia,  putting  back  in 
Australians  on  short-term  travel  overseas,  and  counting 
everyone  at  their  usual  place  of  residence  rather  than  where 
they  were  enumerated. 

It  was  emphasized  that  politicization  did  not  seem  to  be 
an  issue  in  most  of  the  countries  surveyed,  even  though  the 
numbers  from  the  census  are  used  for  political  purposes  in 
some of the developing countries. Politicization is thought to
be  growing  in  Canada,  however.  The  census  act  there 
specifies  that  the  population  is  what  the  Chief  Statistician 
says  it  is  for  very  large  formula  grant  programs.  As  a  result, 
the  decision  of  the  Chief  Statistician  is  being  questioned  on  a 
political level. The constitutional division of powers is similar
in Canada to that in the United States, but the privileges of
the  provinces  are  far  more  jealously  guarded.  So  it  would  be 
anathema  to  Canadians  for  the  Federal  Government  to 
distribute  money  below  the  province  level.  To  the  extent 
that  the  issue  of  undercount  is  politicized,  it  is  only  to  the 
province  level,  not  below.  The  next  step,  though,  is  that  the 
provinces  have  to  distribute  money  within  themselves. 
Statistics  Canada  may  have  to  step  in  and  help  there. 

It  was  noted  also  that  Canada  seems  to  be  the  only 
country  that  consistently  sticks  to  the  reverse  record  check 
methodology.  The  demographic  analysis  technique  does  not 
work  well  in  Canada— there  is  a  great  deal  of  measurable 
emigration.  The  Canadian  PES  grossly  underestimated  the 
undercount,  whereas  the  reverse  record  check  seemed  more 
reliable.  There  is  a  great  deal  of  discussion  about  how  to 
break  down  the  PES  to  the  province  level. 

It  was  felt  that  the  Census  Bureau  is  committed  to 
demographic  analysis  at  the  national  level  and  that  the  PES 
will  be  used  below  the  national  level.  The  PES  tends  to  give 
geographic  variation.  Thus  what  the  Bureau  will  use  is  a 
combination  of  the  methods  to  come  up  with  its  estimates. 

One aspect of the PES operation in Australia was noted
that  makes  it  more  reliable  than  it  would  be  in  Canada  or 
America.  The  Australian  PES  is  completed  3-5  weeks  after 
the  census  day,  which  improves  the  matching  capability 
much  more  than  a  6-month  break  between  the  census  and 
the PES in the United States. A PES taken soon after the census
reduces matching problems but increases correlation problems, that is, one
tends  to  miss  the  same  people.  However,  there  is  a  big 
difference  in  whether  one  has  to  adjust  for  a  5-percent  or  a 
40-percent  difference  between  the  PES  and  the  demographic 
estimates.

INTRODUCTION

This  paper  will  present  the  developing  law  on  the 
utilization  and  adjustment  of  the  decennial  census  of 
population.  The  permissibility  of  adjustments  to  the  census 
undercount  for  apportionment  of  Representatives  in 
Congress,  and  allowed  deviations  for  federally  funded 
programs  will  be  reviewed. 

Feasible  legal  considerations  by  the  Bureau  of  the  Census 
to  adjust  the  census  undercount  for  the  1980  decennial 
census  and  the  mid-decade  census  of  1985  will  be  suggested. 

Political,  policy,  and  administrative  considerations  are  not 
within  the  scope  of  this  commentary. 

BACKGROUND 

Article  I,  Section  2,  of  the  United  States  Constitution 
provided at its initial ratification (1788):

".  .  .Representatives  and  direct  Taxes  shall  be  apportioned 
among  the  several  States  which  may  be  included  within 
this  Union,  according  to  their  respective  Numbers,  which 
shall  be  determined  by  adding  to  the  whole  Number  of 
free  Persons,  including  those  bound  to  Service  for  a  Term 
of  Years,  and  excluding  Indians  not  taxed,  three-fifths  of 
all  other  Persons.  The  actual  Enumeration  shall  be  made 
within  three  years  after  the  first  Meeting  of  the  Congress 
of  the  United  States,  and  within  every  subsequent  Term 
of  ten  Years  in  such  Manner  as  they  shall  by  Law  direct. 
The  Number  of  Representatives  shall  not  exceed  one  for 
every  thirty  Thousand,  but  each  State  shall  have  at  Least 
one  Representative;.  .  ." 

Article  I,  Section  2  was  amended  by  Section  2  of  the  14th 
amendment  to  the  Constitution,  in  1868  in  part  as  follows: 

"Representatives  shall  be  apportioned  among  the  several 
States  according  to  their  respective  Numbers,  counting  the 
whole  Number  of  Persons  in  each  State,  excluding  Indians 
not Taxed." Borough of Bethel Park v. Stans, 449 F.2d
575, 578 (1971) USCA 3rd Cir (Penn).

The  Constitution  embodied  Edmund  Randolph's  proposal 
for  a  periodic  census  to  ensure  "fair  representation  of  the 
people," an idea endorsed by (George) Mason as assuring that
"numbers of inhabitants" should always be the measure of
representation in the House of Representatives. Wesberry v.
Sanders, 376 U.S. 1, 13-14; 84 S.Ct. 526, 533 (1964).

A  census  is  an  official  enumeration  of  the  inhabitants  with 
details  of  sex,  age,  family,  etc.,  and  the  public  record  thereof; 
it  is  not  merely  a  sum  total,  but  an  official  list  containing  the 
names  of  all  the  inhabitants  (citations).  A  "census"  is  not  an 
estimate of the population. Union Electric Co. v. Cuivre
River Electric Coop, Inc., 571 S.W.2d 790, 794 (Mo.) (1978).

From  the  beginning,  Congress  has  consistently  provided 
for  the  mandated  enumeration.  At  the  First  Congress, 
Session  II,  the  First  Decennial  Census  Act  was  adopted 
March  1,  1790,  and,  in  part,  provided: 

Chap.  II.  An  Act  providing  for  the  enumeration  of  the 
Inhabitants  of  the  United  States. (a) 

Section  I,  Be  it  enacted  by  the  Senate  and  House  of 
Representatives  of  the  United  States  of  America  in 
Congress  assembled,  That  the  marshals  of  the  several 
districts  of  the  United  States  shall  be,  and  they  are  hereby 
authorized  and  required  to  cause  the  number  of  the 
inhabitants  within  their  respective  districts  to  be  taken; 
omitting  in  such  enumeration  Indians  not  taxed,  and 
distinguishing  free  persons,  including  those  bound  to 
service  for  a  term  of  years,  from  all  others;  distinguishing 
also  the  sexes  and  colours  of  free  persons,  and  the  free 
males  of  sixteen  years  and  upwards  from  those  under  that 
age; (1  Stat.  101) 

The  current  congressional  enactment,  at  title  13,  United 
States  Code,  follows  in  part: 

Section  141.  Population  and  other  census  information 

(a)  The  Secretary  shall,  in  the  year  1980  and  every  10 
years  thereafter,  take  a  decennial  census  of  population  as 
of  the  first  day  of  April  of  such  year,  which  date  shall  be 
known  as  the  "decennial  census  date,"  in  such  form  and 
content  as  he  may  determine,  including  the  use  of 
sampling  procedures  and  special  surveys.  In  connection 
with  any  such  census,  the  Secretary  is  authorized  to 
obtain  such  other  census  information  as  necessary, 
(d)  Without  regard  to  subsections  (a),  (b),  and  (c)  of  this 
section,  the  Secretary,  in  the  year  1985  and  every  10 
years  thereafter,  shall  conduct  a  mid-decade  census  of 
population in such form and content as he may determine,
including the use of sampling procedures and special
surveys, taking into account the extent to which information
to be obtained from such census will serve in lieu
of  information  collected  annually  or  less  frequently  in 
surveys  or  other  statistical  studies.  The  census  shall  be 
taken  as  of  the  first  day  of  April  of  each  such  year,  which 
date  shall  be  known  as  the  "mid-decade  census  date"  .  .  . 
(Oct. 17, 1976, P.L. 94-521, Section 7(a), 90 Stat. 2461)

The  object  of  the  census,  as  stated  in  the  case  of 
Loughborough v. Blake, 5 Wheat. 317, 320, 5 L.Ed. 98 (1820), is
"to  furnish  a  standard  by  which  representatives,  and  direct 
taxes,  may  be  apportioned  among  the  several  States  which 
may  be  included  within  this  union." 

DISCUSSION 

At  the  outset,  it  should  be  understood  that  the  significant 
litigation  has  been  directed  toward  malapportionment  within 
the  several  States,  not  among  them. 

Stated  another  way  in  Meeks  v.  Avery,  251  F.Supp 
245,249-250  (D.C.  Kansas)  (1966): 

Reference  in  Article  I,  Sections  2  and  4;  in  Section  2  of 
the  Fourteenth  Amendment  to  the  Constitution;  and  in  2 
USCA  Section  2a,  to  the  enumeration  of  the  population 
of  the  various  States  have  to  do  with  the  apportionment 
of  representatives  among  the  States,  not  within  them. 

Serious  question  has  not  been  raised  as  to  apportionment 
of  representatives  "among  the  several  States  according  to 
their  respective  numbers."  There  has  likewise  been  no 
credible  suggestion  that  anything  but  an  actual  decennial 
enumeration  should  be  the  basis  for  apportionment  of  the 
representatives  "among  the  several  States." 

The  Constitution,  as  amended,  mandated  it  and  the 
Congress  has  legislated  it. 

The rush of cases in the 1960's sought to correct
malapportionment of State legislative districts (Baker v. Carr,
369 U.S. 186, 82 S.Ct. 691, 7 L.Ed.2d 663 (1962); Gray v.
Sanders, 372 U.S. 368, 83 S.Ct. 801, 9 L.Ed.2d 821 (1963);
Reynolds v. Sims, 377 U.S. 533, 84 S.Ct. 1362, 12 L.Ed.2d 506
(1964)) and malapportionment within State boundaries of
congressional districts (Wesberry v. Sanders, 376 U.S. 1, 84
S.Ct. 526, 11 L.Ed.2d 481 (1964); Kirkpatrick v. Preisler, 394
U.S. 526, 89 S.Ct. 1225, 22 L.Ed.2d 519 (1969); Wells v.
Rockefeller, 394 U.S. 542, 89 S.Ct. 1230, 22 L.Ed.2d 535
(1969)).

As soon as the apportionment issue shifts to "within State
boundaries," we experience a full range of attempts to use
something other than the published enumeration of the
decennial census for reapportionment of congressional
districts (within States).

In Wesberry v. Sanders, supra, at page 18, the Court said:

While it may not be possible to draw congressional
districts with mathematical precision, that is no excuse for
ignoring  our  Constitution's  plain  objective  of  making 
equal  representation  for  equal  numbers  of  people  the 
fundamental  goal  for  the  House  of  Representatives.  That 
is  the  high  standard  of  justice  and  common  sense  which 
the  founders  set  for  us. 

In  Kirkpatrick  v.  Preisler,  supra,  at  page  535,  on  a 
Missouri  reapportionment  statute  creating  congressional 
districts,  the  U.S.  Supreme  Court  suggested  that  there  may 
be instances where something other than an actual enumeration
would suffice:

...  We  recognize  that  a  congressional  districting  plan  will 
usually  be  in  effect  for  at  least  10  years  and  five 
congressional elections. Situations may arise where substantial
population shifts over such a period can be
anticipated. Where these shifts can be predicted with a
high degree of accuracy, States that are redistricting may
properly  consider  them.  By  this  we  mean  to  open  no 
avenue  for  subterfuge.  Findings  as  to  population  trends 
must  be  thoroughly  documented  and  applied  throughout 
the  State  in  a  systematic,  not  an  ad  hoc,  manner.  .  .  . 

Large  and  obvious  changes  in  population  may  be  a  basis 
for  using  information  other  than  the  enumerated  decennial 
census. In Shalvoy v. Curran, 393 F.2d 55, 57, USCA 2 Cir
(USDC  Conn.)  (1968),  the  court  stated: 

. .  .  While  census  figures  are  a  proper  basis  for  population 
determination  of  a  particular  place  in  reapportionment 
cases  over  the  ten  years  following  its  taking,  under 
circumstances  such  as  those  here  presented  where  such 
large  and  obvious  changes  in  population  have  occurred  as 
to  be  subject  to  judicial  notice,  the  census  report  of  seven 
years  before  is  not  immune  from  challenge. 

In  Meeks  v.  Avery,  251  F  Supp  245,  250  (USDC  Kans.) 
(1966),  the  court  found  that  use  by  the  Kansas  legislature  of 
the  1964  State  enumeration  figures  instead  of  the  1960 
Federal  census  as  the  basis  for  determination  of  population 
was  nothing  more  than  the  exercise  of  judgment  in  the 
legislative  process,  and  found  no  constitutional  fault  with  the 
choice  made. 

It  should  be  noted  that  the  Kansas  count  was  an  "actual 
head  count  of  the  inhabitants  of  the  State,"  its  accuracy  was 
not  questioned,  and  it  was  closer  in  point  of  time. 

Without citation of authority, Meeks v. Avery, supra, at
page 250, suggests that the constitutional mandate and
legislative enactments requiring enumeration for apportionment
of Representatives relate only to apportionment among the
States, and not to apportionment within them.

In Exon v. Tiemann, 279 F Supp 603, 608 (USDC D.
Neb. 1967), the court, in adjudicating a 1961 congressional
redistricting plan void, suggested that estimates of the 1966
population provided by the Bureau of Business Research of
the University of Nebraska, rather than the 1960 census
figures, were more valid:

We  do  not  intend  to  say  in  this  opinion  that  the  Bureau  of 
Business  Research  estimates  are  the  best  standard  to  use  if 
redistricting  is  attempted.  We  are  saying  that  better 
evidence  of  population  in  1967  is  available  than  the  blind 
use  of  the  1960  census. 

The case of Dixon v. Hassler, 412 F.Supp. 1036 (USDC W.D.
Tenn.) 1976, on review of a 1970 congressional reapportionment
statute, is the most recent court statement¹ made
setting forth guidelines for population estimates in lieu of the
decennial census:

We  have  reached  the  conclusion  that,  in  making  this 
reapportionment  ruling,  this  court  is  not  confined  as  a 
matter  of  law  to  the  1970  Federal  census  figures  and  that 
we  may  consider  the  estimates  tendered  by  the  Shelby 
County  Republican  Party.  Kirkpatrick  v.  Preisler,  394 
U.S.  526  at  535,  89  S.Ct.  1225  at  1231,  22  L.Ed.2d  519 
at  527  (1969)  and  the  concurring  opinion  at  537,  89  S.Ct. 
at 1232, 22 L.Ed.2d at 528, so indicate.²

Although  we  have  determined  that  this  court  may 
consider  the  population  estimates,  before  considering  the 
validity  of  such  estimates  as  compared  with  the  1970 
Federal  census  figures,  we  must  first  determine,  as  a  legal 
proposition,  the  strength  that  the  evidence  supporting  the 
estimates  must  have  in  order  to  overcome  the  presumptive 
correctness  of  the  "head  count"  upon  which  the  1970 
Federal  census  was  based.  Exon  v.  Tiemann,  279  F.Supp. 
603  (D.Neb.1967),  while  holding  that  estimates  may  be 
considered,  does  not  really  deal  with  this  question. 
Kirkpatrick,  supra,  at  535,  89  S.Ct.  at  1231,  22  L.Ed.2d 
at  528,  indicates  that  the  estimates  must  have  a  "high 
degree of accuracy" if they are to overcome the presumptive
correctness of the prior decennial census. While
it  is  difficult  to  express  such  propositions  in  quantitative- 
qualitative  terms,  we  believe  that  the  standard  to  be 
applied  is  that  the  decennial  census  figures  will  be 
controlling  unless  there  is  "clear,  cogent,  and  convincing 
evidence"  that  they  are  no  longer  valid  and  that  other 
figures  are  valid.  This  is  a  test  that  lawyers  are  familiar 
with  and  are  accustomed  to  dealing  with. 

At  page  1041,  the  court  made  a  finding  that  the  Shelby 
County  Republican  Party  "has  not  presented  clear,  cogent, 
and  convincing  proof"  that  the  1970  Federal  census  figures 


1  Affirmed on appeal under the name of Republican Party of
Shelby County v. Dixon et al., No. 76-65, 429 U.S. 934; 50 L.Ed.2d
303; 97 S.Ct. 346 (November 8, 1976).

2  By  so  ruling,  we  do  not  mean  to  say  that,  if  these  districts  had 
been  constitutionally  apportioned  following  the  1970  census,  they 
could  be  reapportioned  based  on  population  changes  prior  to  the 
1980  Federal  census. 


are  not  the  best  evidence  of  the  current  population  of  these 
districts  and  that  the  provisional  estimates  of  the  Bureau  of 
the  Census  are  the  best  evidence  of  such  population. 

A  case  relating  to  other  Federal  programs  and  benefits  and 
the  undercount  is  City  of  Camden  v.  Plotkin  (USDC  D.N.J.) 
(Oct  31,  1978)  466  F.Supp.  44,  51.  There  the  court  held 
that  the  individual  plaintiffs  had  standing  to  sue  on  alleged 
undercount  in  the  Camden  pretest,  which  would  adversely 
affect  the  city's  recognition  as  a  prime  sponsor  under  CETA. 

The  court  said: 

It is clear, then, that the plaintiffs' interest in a federally-funded
job program is within the "zone of interest"
contemplated  by  Congress  when  it  mandated  census 
counts  between  each  decennial  census. 

The  population  estimates  at  issue  here  were  formulated  by 
defendants  pursuant  to  13  USC  Section  181  (Interim 
Current  Data).  The  pretest  was  undertaken  under  the 
authority  of  13  U.S.C.  Section  193  which  provides: 
"Preliminary  and  supplemental  statistics  .  .  .  ." 

In  the  1985  mid-decade  census  legislation,  Congress  took 
particular  pains  to  make  certain  that  the  mid-decade  census 
would  not  be  used  for  apportionment  of  Representatives  in 
Congress  "among"  the  several  States  or  "within"  the  several 
States. 

13  U.S.  Code  Section  141(e)  (2): 

Information  obtained  in  any  mid-decade  census  shall 
not  be  used  for  apportionment  of  Representatives  in 
Congress  among  the  several  States,  nor  shall  such 
information  be  used  in  prescribing  congressional 
districts. 

13  U.S.  Code  Section  195,  Use  of  Sampling: 

Except for the determination of population for purposes
of apportionment of Representatives in Congress
among  the  several  States,  the  Secretary  shall,  if  he 
considers  it  feasible,  authorize  the  use  of  the  statistical 
method  known  as  "sampling"  in  carrying  out  the 
provisions  of  this  title.  (October  17,  1976,  90  Stat. 
2464). 

Only population and population characteristics data obtained
in the most recent decennial census could be used by
local  government  where  the  law  conferring  the  benefits  on 
local  governments  required  the  decennial  census  (section 
183(b)). 

ANALYSIS 

The use of "sampling," estimates, statistical methods,
projections, or trends in adjusting for the census undercount
is constrained by legal, political, and administrative
considerations.

Except for the unambiguous mandate of the constitutional
amendment XIV, section 2,

Representatives  shall  be  apportioned  among  the  several 
States  according  to  their  respective  numbers,  counting  the 
whole  number  of  persons  in  each  State,  excluding  Indians 
not  taxed  .  .  ., 

it appears that any "sampling" or statistical method other
than direct enumeration could be used for congressional
apportionment within States, including congressional
redistricting and reapportionment of local legislative offices.

The methods used must comport with the constitutional
requirement, which provides for equal representation for
equal numbers of people and permits only limited population
variances which are unavoidable despite a good faith effort to
achieve absolute equality, or for which justification is shown.
Kirkpatrick v. Preisler, supra, page 525.

On the reapportionment of Representatives of Congress
within the States, the decennial census figures must be used
unless there is "clear, cogent, and convincing evidence" that
they are no longer valid and that other figures are valid.
Dixon v. Hassler, supra, at 1040.

It  should  be  remembered  that  there  is  no  articulated 
proscription  from  using  "sampling"  in  carrying  out  the 
provisions  of  this  title.  13  U.S.  Code,  section  195. 

If "sampling" includes statistical methods, which embrace
estimates, projections, trends, and adjustments for undercount,
and this "sampling" passes the "clear, cogent, and
convincing evidence" test, there is no legal reason to suggest
that an adjustment for undercount could not be made
contemporaneously with the required submission of the decennial
enumeration data to the President.

This,  then,  would  allow  reapportionment  "within"  the 
several  States  to  proceed  based  upon  the  most  accurate  data; 
the  same  data  could  be  used  in  "other"  programs  or 
"benefits"  for  local  governmental  units.  13  U.S.  Code, 
section  183(b). 

All  federally  funded  programs  and  benefits  based  upon 
population  characteristics  can  be  subject  to  the  "sampling" 
provision  of  13  U.S.  Code,  section  195. 

The  Census  Bureau's  application  of  criteria  to  enumerate 
will  not  be  disturbed  unless  it  is  proven  that  the  Bureau 
failed  to  apply  the  proper  criteria  in  a  reasonable  manner  or 
its  application  lacked  a  rational  basis.  Borough  of  Bethel  Park 
v. Stans, 449 F.2d 575, 579 (1971) (USCA 3rd Cir Penn.).

CONCLUSION 

The  U.S.  Constitution,  case  law,  and  the  enactments  of 
the  Congress,  under  the  heretofore  specified  circumstances, 
would  permit  and  may  require,  if  feasible,  adjustment  of  the 
census  undercount  for  all  purposes  except  apportionment  of 
Representatives  of  Congress  "among"  the  several  States. 


Floor  Discussion 


The  group  agreed  that  a  strong  case  had  been  made  for 
adjusting  for  purposes  of  apportionment.  However,  the 
census  counts  have  to  be  given  to  the  Congress  in  January 
1981,  and  data  with  which  to  adjust  are  unlikely  by  that 
time.  Adjustment  under  such  time  constraints— except  for 
body  counts,  missing  questionnaires,  and  the  like— is 
difficult.  There  is  a  need  to  adjust  for  equity,  business  uses, 
and  the  social  sciences,  although  the  case  for  business  may 
not  be  too  strong.  Having  two  sets  of  figures  should  be  no 
problem;  the  Census  Bureau  constantly  adjusts  retail  sales 
data,  for  example,  and  the  consumer  price  index  is  adjusted. 
The  marketing  field  needs  accurate  figures  for  small  areas. 
Although  a  2-percent  error  here  is  acceptable,  10  to  15 
percent  would  be  too  large.  A  2-  to  3-percent  undercount  has 
been  mentioned  frequently.  There  was  some  concern  should 
it  reach  5  to  10  percent,  however.  This  is  not  inconceivable 
should something like the proposed selective service registration
suddenly become linked with the census in people's
thinking.  The  Bureau  must  decide  within  3  to  4  months  what 
type  of  adjustment  it  will  make,  possibly  with  a  minimum 
cutoff  for  areas  with  insignificant  adjustments.  Further,  the 
adjustment procedure must be statistically and legally
defensible.

There is no need for a final decision to come out of this
conference nor is there need for a vote. Rather, the Bureau
should have a sense of the meeting, which is that no matter
what procedure is selected, research should not be reduced.
Secondly, if an adjustment by age, sex, and race is chosen,
this might affect per capita income data, but not necessarily
reduce the undercounts for the age groups that are
responsible for the income. In fact, the reverse might be true.
Whether income would be adjusted or fixed would depend
on the way the adjustment was carried out. It could affect all
of the other data (employment, etc.), or the characteristics
could be based on the unadjusted figures.

It was noted that since political representation is at the
heart of the issue, adjustments for minorities are an
important concern. Nine of the 10 largest congressional
districts that are losing population are predominantly black
and have black Representatives. These districts are also the
ones where the undercount is the highest and may be those
hit hardest by reapportionment, while the areas gaining
population have fewer blacks. The apportionment figures
supplied to the Congress in January 1981 will not reflect the
undercount, yet the need for adjustment is clear. With the
history of black and other minority disenfranchisement,
using unadjusted counts will constitute a "new disenfranchisement."
This conference should suggest solutions to the
technical details and move toward adjustment for apportionment
purposes. Adjustment down to the block level was felt
to be desirable, but the procedures for doing this will require
further research.

A  question  was  raised  on  adjustments  for  Asians  and 
Pacific  Islanders.  The  Bureau  indicated  that  counts  will  be 
reported  for  blacks  and  races  other  than  white.  There  will  be 
an  undercount  rate  as  well  as  a  figure  for  blacks,  so  the 
residual  (however  unreliable)  could  be  used  as  an  adjustment 
for  other  races.  The  adjustment  for  the  residual  would 
increase in areas with fewer blacks, such as Hawaii. Since
no research on good ways to apply the synthetic method to
Hispanics and illegal aliens has been completed, it was asked
what should be done in the absence of a clear method.
Of course, the Census Bureau will be the "expert"
called  in  court  cases.  It  was  felt  that  if  the  Bureau  knows 
what  ought  to  be  done,  it  should  indicate  what  procedures 
will be followed and not leave the issue to the courts and the
legislative branch to decide.

There appeared to be little consensus about what should
be done or who should do it, however. The timing of any
adjustments is even more problematic. The distribution of
funds is an ongoing process, so numbers can be adjusted for
that use; but the distribution of seats is discrete and cannot
be changed the following year. When small-area data become
available (by April 1, 1981), nine States will have already
passed their redistricting deadlines for apportionment.
Thirty-four States must finish by the end of 1981, and only
six can wait until 1982. This is a tight timetable, especially
for the 8 to 10 States that the Census Bureau estimates will
either gain or lose seats as a result of the 1980 census.

The idea of adjusting for reapportionment is an example
of the thesis that "adjustment begets more adjustment" and
suggests a further adjustment problem. Congressional
redistricting pays no attention to municipal boundaries. This
implies that it will be necessary to adjust the counts for 4
million blocks. In fact, if figures other than the initial counts
are used, reapportioning State legislatures could occur every
2 years on the basis of perfectly good revenue-sharing
estimates.

The failure to use adjusted figures for apportionment does
not violate the doctrine of equal protection because Article I
and Amendment XIV of the Constitution both specify the
counting of all inhabitants, and apportionment for the 435
House seats is determined by that enumeration.

It was observed that a 10-percent margin in a redistricting
plan does not have to be justified, but the margin can go
higher if justified (an example of this is the 16.4 percent in a
Virginia case). It was reemphasized that using adjusted
figures  for  apportionment  among  States  has  never  been  raised 
in  the  courts.  The  litigation  has  been  addressed  to  within- 
State  redistricting.  The  use  of  a  10-percent  margin  might 
dilute  equal  rights  and  violate  the  Voting  Rights  Act, 
however.  Any  margin,  however  small,  would  be  too  much  if 
it  deprives  people  of  their  rights.  The  courts  will  ask,  in  such 
cases,  how  "finely  tuned"  the  census  figures  are.  The  Bureau 
imputes  figures  for  "closeout"  cases.  If  the  Constitution 
limits  apportionment  among  the  States  to  the  actual  count, 
but  adjusted  data  can  be  used  within  the  States  for 
redistricting,  then  the  State  figures  will  sum  to  a  higher 
national  total.  However,  there  still  are  only  435  seats  to  be 
distributed. 


Should  the  adjustment  decision  be  made  by  the  courts 
instead  of  the  Bureau,  it  was  thought  that  the  effect  on 
professional  credibility  and  morale  in  the  Census  Bureau 
might  be  negative.  It  might  also  negatively  impact  the  court's 
reputation.  If  the  courts  decide  on  the  adjustment  rather 
than  the  Census  Bureau,  it  will  be  harder  to  revise  the 
methodology  in  the  future.  The  Bureau  did  not  practice 
imputation,  as  such,  prior  to  the  1960  census;  the  court 
decided  in  East  Chicago  vs.  Stans  that  the  Bureau  had  acted 
within  the  meaning  of  the  Constitution  in  carrying  out  the 
1970  census  (in  which  there  was  imputation).  This  would 
appear  to  be  at  the  edge  of  conformity  to  the  law,  however, 
and  imputation  beyond  the  bounds  of  1970  would  be 
risky. 


Considerations of Statistical Equity


Should  the  Census  Count  Be  Adjusted  for 
Allocation  Purposes  -  Equity  Considerations 


I.  P.  Fellegi 

Statistics  Canada 


INTRODUCTION

This paper examines a very special kind of census data
use: Its legislated utilization as input to formulas on the
basis of which funds are allocated from one level of government
to another. To the extent that the census counts are
subject to underenumeration, their use for this purpose
represents a deviation from the legislated intent that
(implicitly) assumes the counts to be free of error.

Estimates of the undercount are often available as part
of the evaluation of the census. These estimates are themselves
subject to sampling and other errors. Should they
nevertheless be used to adjust the census counts for the
purpose of legislated intergovernmental allocation of funds?
The present paper concentrates on the narrow question of
adjustment for this single purpose. However, it is important
to keep in mind that in the real world of statistical policy
a number of other questions must also be answered before
deciding on a specific course of action: If the counts are
adjusted for one purpose (but not others), can users cope
with more than one "official" set of population figures?
Should intercensal population estimates also be adjusted?
Should current surveys use adjusted intercensal estimates
in their ratio estimation (some of the current surveys
themselves have formula allocation uses)? What would be the
impact on electoral redistricting (another legislated use of
census data which, however, requires considerably more
geographic detail, at least in Canada, than the counts needed
for intergovernmental fund allocations)? None of these related
questions are examined in this paper. Even within the narrow
context of a single application, i.e., legislated allocation of
funds, there are several issues which have to be considered:
The intent of the legislation; the danger of "politicizing"
statistics or, more precisely, whether the danger of political
pressures on the census increases or decreases when the counts
are adjusted for underenumeration; and the long-term feedback
effect identified by Nisselson [3], i.e., whether adjusting
the count diminishes the incentive, particularly for
minority groups, to work with the statistical office to
improve the census the next time around. Again, in this
paper, most of these considerations will be largely set aside,
concentrating on the notion of legislative intent or "equity."

Finally, allocation formulas seldom use only the census
as their data source. However, for the sake of simplicity, we
will examine only the impact of errors in the census counts
on fund allocations.


A MEASURE OF INEQUITY

Formula allocations of funds from one level of government
to another take a variety of forms. Many such payments
can  be  broadly  characterized  as  follows: 

(a)  The  national  government  (Federal,  for  the  sake  of 
specificity)  provides  funds  directly  to  the  next  level 
of  government  (provinces,  to  be  specific); 

(b)  The legislation implies explicitly or implicitly the
calculation of a per capita amount in province $i$, say
$X_i$. Then the total payment intended for province $i$
is calculated as

$$T_i = P_i X_i \tag{1}$$

where $P_i$ is the total population of province $i$, or that of a
subgroup of the province, such as the number of university
students, or the number of persons below the poverty
line. The quantity $X_i$ may depend on $P_i$.

In the presence of some underenumeration, the quantity
$P_i$ is estimated as $p_i$. Applying the legislated allocation
formula, but using the known quantities $p_i$ instead of the
unknown $P_i$, the quantity $T_i$ would be estimated as $t_i$.
Therefore, the realized per capita payment in province $i$ is no
longer $X_i$ but rather

$$x_i = t_i / P_i \tag{2}$$

Note that in (2) above, the denominator is $P_i$, not $p_i$, since
the actual population in province $i$ is $P_i$, not $p_i$, so the de
facto per capita payment in a province is equal to the actual
payment (as computed using the census estimates), divided
by the actual total (or subgroup) population of the province.

The per capita deviation between the amount intended
by legislation and the amount actually received is

$$d_i = X_i - x_i \tag{3}$$


We shall define the notion of equity as a numerical measure
of the extent to which legislative intent is complied with.
Thus, this paper proposes as the measure of inequity due to
the census undercount the square of the deviations $d_i$
averaged over the total population (or subgroup):

$$I = \sum_i P_i d_i^2 / P \tag{4}$$

where

$$P = \sum_i P_i$$

Another way of interpreting (4) would be to think of $d_i$
as the per capita underpayment or overpayment received by
a province, in which case $I$ is the weighted average over all
provinces of the square of the per capita underpayments or
overpayments, the weights being the provincial populations
(or applicable subpopulation counts). Indeed

$$d_i = X_i - x_i = (T_i - t_i) / P_i \tag{5}$$

Note that $I$ does not measure only the extent to which
provinces are "shortchanged"; overpayments or underpayments
are given equal weight in that they both have the
same effect on the legislated intent of equity. $I$ will be used
as a measure to determine whether the adjusted or unadjusted
census counts lead to less inequity. It is similar to the
measure proposed by Jabine [2].

TWO TYPES OF ALLOCATIONS

For the sake of simplicity of presentation, we must
simplify the large variety of allocation formulas actually
used. We will focus on two prototype, or model, formulas.

Fixed Per Capita Payment

According to this formula, the amount paid to a province
is directly proportional to the number of persons in the
target group: if $c$ is the intended per capita payment, then

$$T_i = P_i c$$

$$t_i = p_i c$$

$$X_i = c$$

$$x_i = p_i c / P_i$$

$$d_i = \frac{P_i - p_i}{P_i} \, c$$

$$I_1 = (c^2 / P) \sum_i P_i \left( \frac{P_i - p_i}{P_i} \right)^2$$

Denoting by $u_i$ the proportionate underenumeration in
province $i$ (which, of course, may theoretically be negative
in the case of overenumeration)

$$u_i = \frac{P_i - p_i}{P_i} \tag{6}$$

we get

$$I_1 = c^2 \sum_i P_i u_i^2 / P \tag{7}$$

A reasonable approximation of this allocation would be that
legislated by the U.S. Elementary and Secondary Education
Act, as described in the "Report on Statistics for Allocation
of Funds," published by the Office of Federal Statistical
Policy and Standards [4].

Note that since equity was defined as compliance with
legislative intent, $I_1 \neq 0$ even if $u_i$ is a constant over all
provinces. In this case every province is shortchanged by the
same per capita amount. So while there is no differential
shortchanging of provinces, the legislative intent is violated
to the extent that the de facto per capita payment differs
from the legislated one.

Fixed Total Payments

Under this model, the Federal Government distributes to
provinces a fixed amount, the total received by a given
province being proportional to its population. Then, if $C$ is
the total to be distributed,

$$T_i = \frac{P_i}{\sum_j P_j} \, C$$

$$t_i = \frac{p_i}{\sum_j p_j} \, C$$

$$X_i = \frac{C}{\sum_j P_j}$$

$$x_i = \frac{p_i}{\sum_j p_j} \cdot \frac{C}{P_i}$$

$$d_i = \left( \frac{1}{\sum_j P_j} - \frac{p_i}{P_i \sum_j p_j} \right) C$$

It is easy to see, using the notation of (6), that the
corresponding inequity measure, $I_2$, becomes

$$I_2 = \frac{C^2}{P^3 (1 - \bar{u})^2} \sum_i P_i (u_i - \bar{u})^2 \tag{8}$$

where

$$\bar{u} = \sum_i P_i u_i / P \tag{9}$$
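
As a numerical check on the closed forms (7) and (8), the following sketch computes the inequity measure both directly from definition (4) and from the closed-form expressions. The function names and the two-province figures are illustrative assumptions, not data from the paper.

```python
def direct_inequity(P, T, t):
    """Definition (4): population-weighted mean of squared d_i = (T_i - t_i) / P_i."""
    total = sum(P)
    return sum(Pi * ((Ti - ti) / Pi) ** 2 for Pi, Ti, ti in zip(P, T, t)) / total


def undercount_rates(P, p):
    """Equation (6): u_i = (P_i - p_i) / P_i."""
    return [(Pi - pi) / Pi for Pi, pi in zip(P, p)]


def inequity_fixed_per_capita(P, p, c):
    """Equation (7): I_1 = c^2 * sum(P_i u_i^2) / P."""
    u = undercount_rates(P, p)
    return c ** 2 * sum(Pi * ui ** 2 for Pi, ui in zip(P, u)) / sum(P)


def inequity_fixed_total(P, p, C):
    """Equation (8): I_2 = C^2 * sum(P_i (u_i - u_bar)^2) / (P^3 (1 - u_bar)^2)."""
    total = sum(P)
    u = undercount_rates(P, p)
    u_bar = sum(Pi * ui for Pi, ui in zip(P, u)) / total  # equation (9)
    spread = sum(Pi * (ui - u_bar) ** 2 for Pi, ui in zip(P, u))
    return C ** 2 * spread / (total ** 3 * (1 - u_bar) ** 2)


# Hypothetical true populations P and census counts p for two provinces.
P = [1000.0, 2000.0]
p = [950.0, 1960.0]
c, C = 10.0, 30000.0

per_capita_direct = direct_inequity(P, [Pi * c for Pi in P], [pi * c for pi in p])
fixed_total_direct = direct_inequity(
    P,
    [Pi / sum(P) * C for Pi in P],
    [pi / sum(p) * C for pi in p],
)
```

With a uniform undercount (for example `p = [950.0, 1900.0]`), the fixed total measure vanishes while the fixed per capita measure stays positive, which mirrors the remark that $I_1 \neq 0$ even when $u_i$ is constant.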


A reasonable approximation of this model occurs in the
Canadian Fiscal Arrangements Act. According to this Act,
the revenue capacity of each province is quantified in a
fashion independent of the census. Let this measure be $M_i$
for the $i$th province. Then province $i$ receives the amount

$$\frac{P_i}{\sum_j P_j} \sum_j M_j - M_i$$

when the amount above is positive, and receives nothing if
the amount is negative. Since $M_i$ is independent of the
population count $P_i$, the effect of underenumeration on receiving
provinces can be studied within the fixed total payment
model: in either case, the critical factor is the proportion of
the total population residing in a province, as opposed to
the absolute number in the fixed per capita payment case.
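
The payment rule just described can be sketched in a few lines. This assumes the amount is the province's population share of total revenue capacity minus its own capacity, floored at zero; the function name and the figures are hypothetical.

```python
def equalization_payments(M, P):
    """Payment to each province: (P_i / sum P) * sum(M) - M_i, or zero if negative.

    M: revenue capacity per province (measured independently of the census).
    P: population per province.
    """
    total_M = sum(M)
    total_P = sum(P)
    return [max(0.0, Pi / total_P * total_M - Mi) for Mi, Pi in zip(M, P)]


# Two hypothetical provinces of equal size with unequal revenue capacity.
payments = equalization_payments(M=[100.0, 300.0], P=[1.0, 1.0])
```

Only the population shares enter the formula, which is why an undercount that is uniform across provinces leaves these payments unchanged.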

TESTS  APPLICABLE  TO  THE  FIXED  PER 
CAPITA  PAYMENT  MODEL 


We assume that estimates of the undercount proportions
$u_i$ are available which can potentially be used to adjust the
census counts. Let $\hat{u}_i$ be the available estimate of $u_i$. Let

$$E(\hat{u}_i) = e_i \tag{10}$$

so that

$$b_i = e_i - u_i \tag{11}$$

is the bias of the estimates. We also assume that the
population counts $p_i$ have a negligible variance.

If we were to adjust the population estimates, the adjusted
counts would have a residual underenumeration (which
could be negative) equal to $u_i - \hat{u}_i$. So the measure of
inequity, if the adjusted census counts are used for the fund
allocation, can be obtained from (7) by substituting $u_i - \hat{u}_i$
for $u_i$:

$$I_1^A = c^2 \sum_i P_i (\hat{u}_i - u_i)^2 / P \tag{12}$$

Ideally, one would like to find an adjustment that minimizes
$I_1^A$. This is unlikely to be possible. A more modest but still
very relevant objective is to find an adjustment that reduces
the inequity, i.e., for which the difference $I_1 - I_1^A$ is positive:

$$\delta I_1 = I_1 - I_1^A = \frac{c^2}{P} \left[ \sum_i P_i u_i^2 - \sum_i P_i (u_i - \hat{u}_i)^2 \right]$$

Leaving out the positive factor in front of the brackets, we
will examine

$$\Delta I_1 = \frac{P}{c^2} (I_1 - I_1^A) = \sum_i P_i u_i^2 - \sum_i P_i (u_i - \hat{u}_i)^2 \tag{13}$$

The expression (13) above may not be estimable directly if
no unbiased estimates of $u_i$ exist. However, after some
manipulation, we obtain using the notation of (11)

$$\Delta I_1 = \sum_i P_i e_i^2 - \sum_i P_i (\hat{u}_i - e_i)^2 - 2 \sum_i P_i b_i \hat{u}_i \tag{14}$$

All terms of (14) are estimable except the last one. The last
term is equal to zero if the estimates of underenumeration
are unbiased, i.e., if $b_i = 0$ for all $i$. If this is not the case, a
simplifying assumption is needed.

Assumption A: If the estimates $\hat{u}_i$ of underenumeration
have a non-negligible bias, assume that

$$\sum_i P_i b_i \hat{u}_i \le 0$$

Assumption A is likely to be satisfied since the estimates
$\hat{u}_i$ are typically positive and are usually underestimates of
the unknown underenumeration $u_i$ (i.e., $b_i \le 0$). Even if the
quantities $b_i$ are not negative for all $i$, assumption A is
satisfied if the estimated underenumeration tends to be low
($b_i \le 0$) for those subgroups which the census finds difficult
to enumerate (even if $b_i > 0$ for groups which the census
finds easier to enumerate). In other words, assumption A
is likely to be satisfied if high values of $\hat{u}_i$ are accompanied
by negative values of $b_i$, even if low values of $\hat{u}_i$ are
accompanied by positive $b_i$. Of course, assumption A can always
be satisfied if a sufficiently conservative set of estimates
$\hat{u}_i$ is used. Most methods used in practice to provide estimates
of $u_i$ are likely to satisfy the assumption: postenumeration
surveys, dual method estimates, and reverse record checks
(see later section for a brief discussion of the latter). The
assumption may not hold for so-called analytic estimates of
$u_i$.

Denote

$$\Delta I_1' = \sum_i P_i e_i^2 - \sum_i P_i (\hat{u}_i - e_i)^2$$

Under assumption A

$$\Delta I_1 \ge \Delta I_1'$$

We will construct a test of the positivity of $\Delta I_1'$, which will
therefore serve as a conservative test of the positivity of
$\Delta I_1$.
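
The manipulation leading from (13) to (14) is short. The steps below are a reconstruction from definitions (11) and (13), offered as a check rather than as the author's own working:

```latex
\begin{align*}
\Delta I_1
  &= \sum_i P_i \left[ u_i^2 - (u_i - \hat{u}_i)^2 \right]
   = \sum_i P_i \left[ 2 u_i \hat{u}_i - \hat{u}_i^2 \right] \\
  &= \sum_i P_i \left[ 2 (e_i - b_i) \hat{u}_i - \hat{u}_i^2 \right]
   \qquad \text{since } u_i = e_i - b_i \text{ by (11)} \\
  &= \sum_i P_i \left[ e_i^2 - (\hat{u}_i - e_i)^2 \right]
   - 2 \sum_i P_i b_i \hat{u}_i .
\end{align*}
```

The last line uses $2 e_i \hat{u}_i - \hat{u}_i^2 = e_i^2 - (\hat{u}_i - e_i)^2$, so only the final sum involves the unknown biases $b_i$.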

In order to estimate the first two terms of $\Delta I_1'$, we note
that if $\operatorname{var} \hat{u}_i$ is an unbiased estimate of $\operatorname{Var} \hat{u}_i$, i.e., if

$$E(\operatorname{var} \hat{u}_i) = \operatorname{Var} \hat{u}_i \tag{15}$$

then

$$E(\hat{u}_i^2 - 2 \operatorname{var} \hat{u}_i) = e_i^2 - \operatorname{Var} \hat{u}_i \tag{16}$$

so that

$$\widehat{\Delta I}_1' = \sum_i P_i (\hat{u}_i^2 - 2 \operatorname{var} \hat{u}_i) \tag{17}$$

would be an estimator of $\Delta I_1'$ if the quantities $P_i$ were
known.1 However, since their role is only to provide weights
to the averaging process, (17) is not likely to be sensitive to
even reasonably significant variation in the weights $P_i$.
Therefore in place of $P_i$ we can use census population counts $p_i$, or
alternatively the adjusted census estimates $\hat{P}_i = p_i (1 + \hat{u}_i)$. We
get our first test:


Test 1: Adjust the census counts if

$$\sum_i p_i (\hat{u}_i^2 - 2 \operatorname{var} \hat{u}_i) > 0 \tag{18}$$

A more conservative test can be constructed (and one that
does not require the values $P_i$) as follows:

Test 2: Adjust the census counts if for all $i$

$$\hat{u}_i^2 - 2 \operatorname{var} \hat{u}_i > 0 \tag{19}$$


Inequality (19) is, of course, equivalent to the condition
that the estimated rel-variance of all provincial
underenumeration rates should be less than or equal to 1/2. It can
also be looked upon as a form of significance test for the
hypothesis of $E(\hat{u}_i) = 0$.

Test 2 is suggestive of an alternative approach whereby
those provincial populations for which (19) is positive would
be adjusted, leaving the remaining ones unadjusted. If $s_1$ is
the set of provinces for which (19) is positive and $s_2$ is the
set of all other provinces, this alternative adjustment strategy
results in the following value of the change in inequity:

$$\sum_{i \in s_1} p_i (\hat{u}_i^2 - 2 \operatorname{var} \hat{u}_i)$$

which is, of course, positive.

The application of test 1 (or the stronger test 2) does not
guarantee that the adjusted census counts will, in fact, reduce
the inequity of allocation, since the inequity measures
themselves are based on sample estimates. All one can really say
is that, if the sample size is large enough so that the sampling
distribution of the left hand side of (18) is reasonably
symmetric, it is more likely that the adjusted counts will result
in less inequity than the unadjusted ones. Put differently,
if there are no penalties attached to adjusting the
counts, one could certainly use test 1 or test 2 as a
sufficient condition for adjusting. A much more conservative test
results if the basic strategy is to adjust only when the
evidence, in some sense, is overwhelming that the adjusted
estimates would reduce inequity. Under this strategy one
would adjust the estimates only if one were reasonably
certain that the unadjusted estimates lead to a higher measure
of inequity.

1 It should be noted that $\Delta I_1'$ is itself based on sample estimates,
so the expected value of $\widehat{\Delta I}_1'$ is not $\Delta I_1'$. However,
$E(\widehat{\Delta I}_1' - \Delta I_1') = 0$.

Normally, if one wanted to construct a test for the
positivity of $\Delta I_1'$, one would construct a test statistic based
on its estimate $\widehat{\Delta I}_1'$ and the standard deviation of the latter.
However, since $\Delta I_1'$ itself is a mixed expression involving
both parameters ($e_i$) and sample estimates ($\hat{u}_i$), that approach
would lead to a test of the positivity of the common
expected value of $\Delta I_1'$ and $\widehat{\Delta I}_1'$, not of $\Delta I_1'$ itself. We will
therefore consider the standard deviation of $\Delta I_1' - \widehat{\Delta I}_1'$. Let

$d$ = estimated standard deviation of $\Delta I_1' - \widehat{\Delta I}_1'$

Then we have, under the usual assumption of approximate
normality,

$$E(\Delta I_1' - \widehat{\Delta I}_1') = 0 \tag{20}$$

$$\operatorname{Prob}(\Delta I_1' - \widehat{\Delta I}_1' > -2d) = 0.975 \tag{21}$$

Thus if we also have

$$\widehat{\Delta I}_1' > 2d \tag{22}$$

then from (21) and (22) it follows that

$$\operatorname{Prob}(\Delta I_1' > 0) \ge 0.975 \tag{23}$$

Therefore (22) provides the desirable test of (23). We need
to estimate, however, the standard deviation of $\Delta I_1' - \widehat{\Delta I}_1'$.


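$$\Delta I_1' - \widehat{\Delta I}_1' = 2 \sum_i P_i (e_i \hat{u}_i - \hat{u}_i^2 + \operatorname{var} \hat{u}_i) \tag{24}$$

Now, it is easy to show that up to terms of order $1/n_i$
(where $n_i$ is the sample size in province $i$)

$$\operatorname{Var}(e_i \hat{u}_i - \hat{u}_i^2 + \operatorname{var} \hat{u}_i) = e_i^2 \operatorname{Var} \hat{u}_i \tag{25}$$

which, to the same order of approximation, is estimated by

$$\hat{u}_i^2 \operatorname{var} \hat{u}_i$$

Also, to the same order of approximation, it can be shown
that

$$\operatorname{Cov}(e_i \hat{u}_i - \hat{u}_i^2 + \operatorname{var} \hat{u}_i, \; e_j \hat{u}_j - \hat{u}_j^2 + \operatorname{var} \hat{u}_j) = e_i e_j \operatorname{Cov}(\hat{u}_i, \hat{u}_j) \tag{26}$$

so that the sign of the covariance terms in the variance of
(24) is equal to those of $\operatorname{Cov}(\hat{u}_i, \hat{u}_j)$. We now make the
following assumption.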
Assumption B: Assume that the estimates $\hat{u}_i$ are either
independent or not positively correlated with one another.

Under assumption B an overestimate of the variance of
$\Delta I_1' - \widehat{\Delta I}_1'$ is obtained by

$$4 \sum_i P_i^2 \hat{u}_i^2 \operatorname{var} \hat{u}_i$$

So, we obtain the following test:

Test 3: Adjust the census counts if

$$\sum_i p_i (\hat{u}_i^2 - 2 \operatorname{var} \hat{u}_i) - 4 \sqrt{\sum_i p_i^2 \hat{u}_i^2 \operatorname{var} \hat{u}_i} > 0 \tag{27}$$

If test 3 is satisfied, then under assumptions A and B the
odds are at least 0.975 that the adjusted counts result in less
inequity than the unadjusted ones.

It should be noted that assumption B is satisfied (in fact,
the terms of (24) are independent) in the case of
postenumeration surveys whose stratum boundaries respect provincial
boundaries. It is likely to be satisfied in the sense of
nonpositive correlations between the $\hat{u}_i$ if the provincial estimates
$\hat{u}_i$ are based on domain estimators (since the sum of the
variances of estimates prepared for different domains is
usually larger than the variance of the estimate prepared
for the union of the domains). At any rate, assumption B
is not nearly as important as assumption A, since the
covariance terms are likely to be very small so long as the
sampling ratios used in the survey to estimate $u_i$ are small.
In fact, they are of the order $1/P$, so that neglecting them
is equivalent to neglecting the finite population correction.

Nevertheless, if assumption B cannot be accepted, an
even more conservative test results if the principle applied
to the right hand side of (18) is applied separately to every
term there. We then obtain

Test 4: Adjust the census counts if for all $i$

$$\hat{u}_i^2 - 2 \operatorname{var} \hat{u}_i - 4 \sqrt{\hat{u}_i^2 \operatorname{var} \hat{u}_i} \ge 0 \tag{28}$$

Based on test 4, a conservative adjustment strategy might
involve adjusting only those provincial census counts for
which the left hand side of (28) is positive.

In conclusion, two points may be noted. First, if the
estimates $\hat{u}_i$ are not based on sample data (such as in the
case of analytic estimates), then $\hat{u}_i = e_i$, $\operatorname{Var} \hat{u}_i = 0$, hence from
(14) a condition for the adjusted counts to result in lower
inequity is

$$\sum_i P_i \hat{u}_i^2 - 2 \sum_i P_i b_i \hat{u}_i \ge 0 \tag{29}$$

for which assumption A provides a sufficient (though clearly
not necessary) condition.

Second, the entire development of this section is valid
(with obvious modifications) if the fixed per capita payment
$c$ is constant within each province as opposed to its being
constant for the whole country. In other words, $c$ can be
replaced by a set of constants $c_i$, as long as each of them is
determined without reference to the census population
estimates $p_i$. This is, in fact, the case with the U.S.
Elementary and Secondary Education Act mentioned above.
Even more generally, much the same results and derivations
apply (with $P_i$ or $p_i$ replaced, respectively, by $c_i P_i$ and
$c_i p_i$) under the allocation formula

$$T_i = c_i P_i + b_i$$

where $c_i$ and $b_i$ are provincial constants that do not depend
on $P_i$ or $p_i$.

TESTS APPLICABLE TO THE FIXED TOTAL-
COST MODEL

If we were to adjust the census counts for the present
allocation model, the resulting inequity measure would be
obtained from (8) by substituting $u_i - \hat{u}_i$ for $u_i$:

$$I_2^A = \frac{C^2}{P^3 (1 - \bar{u} + \hat{\bar{u}})^2} \sum_i P_i (u_i - \hat{u}_i - \bar{u} + \hat{\bar{u}})^2$$

where

$$\hat{\bar{u}} = \sum_i P_i \hat{u}_i / P$$

Now if

$$\hat{\bar{u}} \ge 0$$

which should certainly hold, we get

$$I_2 - I_2^A \ge \frac{C^2}{P^3 (1 - \bar{u})^2} \left[ \sum_i P_i (u_i - \bar{u})^2 - \sum_i P_i (u_i - \hat{u}_i - \bar{u} + \hat{\bar{u}})^2 \right] \tag{30}$$

Proceeding in a way analogous to the previous section, we
denote

$$\Delta I_2 = \sum_i P_i (u_i - \bar{u})^2 - \sum_i P_i (u_i - \hat{u}_i - \bar{u} + \hat{\bar{u}})^2$$

After some manipulation, we obtain

$$\Delta I_2 = \sum_i P_i (e_i - \bar{e})^2 - \sum_i P_i (\hat{u}_i - \hat{\bar{u}} - e_i + \bar{e})^2 - 2 \sum_i P_i (b_i - \bar{b})(\hat{u}_i - \hat{\bar{u}}) \tag{31}$$
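
The manipulation leading to (31) parallels the one for (14). The sketch below is a reconstruction, assuming that $\bar{e}$ and $\bar{b}$ denote the $P_i$-weighted averages of $e_i$ and $b_i$ (defined by analogy with (9); the text does not state this explicitly):

```latex
\begin{align*}
\text{Let } a_i &= u_i - \bar{u}, \qquad
\hat{a}_i = \hat{u}_i - \hat{\bar{u}}, \qquad
\bar{e} = \sum_i P_i e_i / P, \qquad
\bar{b} = \sum_i P_i b_i / P . \\
\Delta I_2
  &= \sum_i P_i \left[ a_i^2 - (a_i - \hat{a}_i)^2 \right]
   = \sum_i P_i \left[ 2 a_i \hat{a}_i - \hat{a}_i^2 \right] \\
  &= \sum_i P_i \left[ 2 (e_i - \bar{e}) \hat{a}_i - \hat{a}_i^2 \right]
   - 2 \sum_i P_i (b_i - \bar{b}) \hat{a}_i
   \qquad \text{since } a_i = (e_i - \bar{e}) - (b_i - \bar{b}) \\
  &= \sum_i P_i (e_i - \bar{e})^2
   - \sum_i P_i \left( \hat{a}_i - (e_i - \bar{e}) \right)^2
   - 2 \sum_i P_i (b_i - \bar{b}) \hat{a}_i .
\end{align*}
```

As in the per capita case, only the last sum depends on the unknown biases.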
As in the previous section, the last term is zero if the
quantities $\hat{u}_i$ are unbiased estimates of $u_i$. Failing that, we need
an assumption analogous to assumption A.

Assumption C: If the estimates $\hat{u}_i$ have a non-negligible
bias, assume that

$$\sum_i P_i (b_i - \bar{b})(\hat{u}_i - \hat{\bar{u}}) \le 0$$

Assumption C is satisfied if it is predominantly true that
wherever the estimated underenumeration is above average,
its measurement is also worse than average ($b_i < \bar{b}$), and
conversely. This is likely to be the case at least for most
survey-derived estimates of underenumeration: people whom
the census finds most difficult to count are, typically, also
more difficult to enumerate in the evaluation survey. Note
that assumption C is likely to be somewhat stronger than
assumption A of the previous section, since the former is
equivalent to

$$\sum_i P_i b_i \hat{u}_i \le P \bar{b} \hat{\bar{u}}$$

where the right-hand side is very likely to be nonpositive;
whereas assumption A only requires that the left-hand side
be nonpositive.

Under assumption C we readily obtain

$$\Delta I_2 \ge \Delta I_2' = \sum_i P_i (e_i - \bar{e})^2 - \sum_i P_i (\hat{u}_i - \hat{\bar{u}} - e_i + \bar{e})^2$$

$$= \sum_i P_i e_i^2 - \sum_i P_i (\hat{u}_i - e_i)^2 - P \bar{e}^2 + P (\hat{\bar{u}} - \bar{e})^2 \tag{32}$$

All the quantities on the right-hand side of (32) are estimable
unbiasedly, except $P_i$. As before, if one accepts the biased
estimates $p_i$ as serviceable (their role being that of a set of
weights), we get our first test:

Test 1: Adjust the census counts if

$$\widehat{\Delta I}_2' = \sum_i p_i (\hat{u}_i^2 - 2 \operatorname{var} \hat{u}_i) - p \hat{\bar{u}}^2 + 2 p \operatorname{var} \hat{\bar{u}}$$

$$= \sum_i p_i \left[ (\hat{u}_i - \hat{\bar{u}})^2 - 2 \operatorname{var} \hat{u}_i + 2 \operatorname{var} \hat{\bar{u}} \right] \ge 0 \tag{33}$$

A test analogous to test 2 of the previous section could
immediately be deduced, but it would not be useful: in the
present case, it is not valid to contemplate adjusting the
census count of only some of the provinces, since the
payment to any province depends on the population of all
provinces.
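
The fixed total-cost version of test 1 can be sketched as follows. The text does not say how the variance of the weighted average undercount is obtained; this sketch assumes independent provincial estimates, so that it equals the sum of squared population shares times the provincial variances. The function names and inputs are hypothetical.

```python
def weighted_mean(p, u_hat):
    """Population-weighted average of the estimated undercount rates."""
    return sum(pi * ui for pi, ui in zip(p, u_hat)) / sum(p)


def var_weighted_mean(p, var_u):
    """Variance of the weighted mean, ASSUMING independent provincial estimates."""
    total = sum(p)
    return sum((pi / total) ** 2 * vi for pi, vi in zip(p, var_u))


def fixed_total_test1(p, u_hat, var_u):
    """Sum of p_i * [(u_hat_i - u_bar_hat)^2 - 2 var u_hat_i + 2 var u_bar_hat]."""
    u_bar = weighted_mean(p, u_hat)
    v_bar = var_weighted_mean(p, var_u)
    return sum(
        pi * ((ui - u_bar) ** 2 - 2 * vi + 2 * v_bar)
        for pi, ui, vi in zip(p, u_hat, var_u)
    )


# Hypothetical inputs: differing versus identical estimated undercount rates.
differing = fixed_total_test1([100.0, 100.0], [0.05, 0.01], [0.0001, 0.0001])
identical = fixed_total_test1([100.0, 100.0], [0.03, 0.03], [0.0001, 0.0001])
```

When the estimated rates are identical across provinces there is nothing to gain from adjusting under this model, and the statistic is negative; it becomes positive only when the spread of the estimates outweighs their sampling variance.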

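The test above is designed to ensure that the census counts
are adjusted if the adjusted counts are more likely to lead to
a more equitable allocation of funds. As in the previous
section, we can construct a test that will lead to an
adjustment only if the adjusted counts are almost certain to lead
to a more equitable allocation. In order to do so, we need
to derive the variance of $\Delta I_2' - \widehat{\Delta I}_2'$, where

$$\widehat{\Delta I}_2' = \sum_i P_i \left[ (\hat{u}_i - \hat{\bar{u}})^2 - 2 \operatorname{var} \hat{u}_i + 2 \operatorname{var} \hat{\bar{u}} \right]$$

It is easy to verify that

$$\Delta I_2' - \widehat{\Delta I}_2' = 2 \sum_i P_i \left[ e_i \hat{u}_i - \hat{u}_i^2 + \hat{\bar{u}}^2 - \bar{e} \hat{\bar{u}} + \operatorname{var} \hat{u}_i - \operatorname{var} \hat{\bar{u}} \right] \tag{34}$$

The following formula can be obtained after some algebra,
which is correct to order $1/n$:

$$\operatorname{Var}(\Delta I_2' - \widehat{\Delta I}_2') = 4 \Big\{ \sum_i P_i^2 \left[ e_i^2 \operatorname{Var} \hat{u}_i + \bar{e}^2 \operatorname{Var} \hat{\bar{u}} - 2 e_i \bar{e} \operatorname{Cov}(\hat{u}_i, \hat{\bar{u}}) \right]$$

$$+ \sum_{i \ne j} P_i P_j \left[ e_i e_j \operatorname{Cov}(\hat{u}_i, \hat{u}_j) - e_i \bar{e} \operatorname{Cov}(\hat{u}_i, \hat{\bar{u}}) - e_j \bar{e} \operatorname{Cov}(\hat{u}_j, \hat{\bar{u}}) + \bar{e}^2 \operatorname{Var} \hat{\bar{u}} \right] \Big\}$$

$$= 4 \operatorname{Var} \Big[ \sum_i P_i (e_i \hat{u}_i - \bar{e} \hat{\bar{u}}) \Big]$$

$$= 4 \operatorname{Var} \Big[ \sum_i P_i (e_i - \bar{e}) \hat{u}_i \Big]$$

$$= 4 \sum_i P_i^2 (e_i - \bar{e})^2 \operatorname{Var} \hat{u}_i + 4 \sum_{i \ne j} P_i P_j (e_i - \bar{e})(e_j - \bar{e}) \operatorname{Cov}(\hat{u}_i, \hat{u}_j) \tag{35}$$
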

The sign of the last term of (35) is not easy to guess. If
the estimates $\hat{u}_i$ were based on simple random sampling,
$\operatorname{Cov}(\hat{u}_i, \hat{u}_j)$ would be negative and proportional to $e_i e_j$. In
that case, it is easy to verify that the last term of (35) is
negative. In the case of more complex designs, this will still
hold if the design effects applicable to the covariances are
reasonably equal. In general, the negativity of this term
cannot be proven, although it is likely to be true. If the last term
is negative, dropping it would lead to a conservative estimate
of the variance. At any rate, for reasonable values of the
sampling ratio, the covariances will be of the order of $1/P$,
thus neglecting them is analogous to dropping the finite
population correction. Now, proceeding as in the previous
section, we get the following test.

Test 2: Adjust the census counts if

$$\sum_i p_i \left[ (\hat{u}_i - \hat{\bar{u}})^2 - 2 \operatorname{var} \hat{u}_i + 2 \operatorname{var} \hat{\bar{u}} \right] - 4 \sqrt{\sum_i p_i^2 (\hat{u}_i - \hat{\bar{u}})^2 \operatorname{var} \hat{u}_i} > 0 \tag{36}$$

If test 2 is satisfied, then under assumption C and given
moderate sampling ratios in all provinces, the odds are at
least 0.975 that the adjusted counts result in less inequity
than the unadjusted ones.
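
A sketch of test 2 for the fixed total-cost model follows, assuming (36) is the test 1 statistic of (33) minus $4\sqrt{\sum p_i^2 (\hat{u}_i - \hat{\bar{u}})^2 \operatorname{var}\hat{u}_i}$. As an additional assumption not taken from the text, the variance of the weighted average undercount is computed as if the provincial estimates were independent. All names and figures are hypothetical.

```python
import math


def fixed_total_test2(p, u_hat, var_u):
    """Point estimate of the inequity reduction minus a two-sigma margin."""
    total = sum(p)
    u_bar = sum(pi * ui for pi, ui in zip(p, u_hat)) / total
    # ASSUMPTION: independent provincial estimates.
    v_bar = sum((pi / total) ** 2 * vi for pi, vi in zip(p, var_u))
    point = sum(
        pi * ((ui - u_bar) ** 2 - 2 * vi + 2 * v_bar)
        for pi, ui, vi in zip(p, u_hat, var_u)
    )
    margin = 4 * math.sqrt(
        sum(pi ** 2 * (ui - u_bar) ** 2 * vi for pi, ui, vi in zip(p, u_hat, var_u))
    )
    return point - margin


# Same hypothetical provinces with precise and imprecise undercount estimates.
precise = fixed_total_test2([100.0, 100.0], [0.05, 0.01], [1e-6, 1e-6])
imprecise = fixed_total_test2([100.0, 100.0], [0.05, 0.01], [1e-4, 1e-4])
```

With these made-up inputs, the precise estimates pass the test while the imprecise ones fail it, even though the point estimate of the inequity reduction is positive in both cases.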
Finally, we may note that if the $\hat{u}_i$ are not sample estimates,
$\hat{u}_i = e_i$, $\operatorname{Var} \hat{u}_i = 0$, hence from (31) we get

$$\Delta I_2 = \sum_i P_i (\hat{u}_i - \hat{\bar{u}})^2 - 2 \sum_i P_i b_i (\hat{u}_i - \hat{\bar{u}}) \tag{37}$$

A sufficient condition for (37) to be nonnegative is that
assumption C holds. So, in this case, one should adjust the
census counts whenever that assumption holds.

As in the case of the fixed per capita payment model,
much the same development applies under the slightly more
general allocation model

$$T_i = C (P_i / P) + b_i$$

where $C$ and $b_i$ are constants that do not depend on $P_i$ or
$p_i$.

MEASURING  THE  CENSUS  UNDERCOUNT 
IN  CANADA 

In the censuses of 1961, 1966, 1971, and 1976, the main
vehicle used to measure the undercount has been the so-called
reverse record check (RRC). The methodology of this
vehicle, as used in the 1976 census, is briefly described below.
Additional detail is available in references 1 and 4.

For  census  purposes,  persons  are  to  be  enumerated  at 
their  usual  place  of  residence.  There  are,  however,  two 
groups  that  are  included  in  the  final  census  count  but  are  not 
counted  at  their  usual  residence  in  Canada.  The  first  group 
consists  of  diplomatic,  military,  and  other  personnel  (e.g., 
merchant  marine)  living  abroad.  The  second  group  corre- 
sponds to  persons  who  were  enumerated  on  a  special  form 
at  a  temporary  address  (hotels,  motels,  etc.)  at  the  time  of 
the  census  and  who  were  missed  at  their  usual  residence. 

If a sample of persons could be drawn from sources
independent of the current census, and if the current address
for each selected person were determined, one could directly
search the current census file. Those who could not be found
there would represent a sample of persons not enumerated
at their regular address. The weighted total based on such
a sample conceptually provides an estimate of the number
of persons enumerated away from their regular residence,
plus those missed by the census. Since the former count is
independently known, the latter is readily estimated.
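
The arithmetic behind this estimate is a single subtraction, sketched below. The function name, the sampling weights, and the count of persons enumerated away from home are all hypothetical.

```python
def estimate_missed(weights_not_found, enumerated_away):
    """Estimated number of persons missed by the census.

    weights_not_found: sampling weights of selected persons who could not be
        found in the census file at their regular address.
    enumerated_away: independently known count of persons enumerated away
        from their regular residence.
    """
    not_at_regular_address = sum(weights_not_found)
    return not_at_regular_address - enumerated_away


# Three hypothetical sampled persons not found at their regular address.
missed = estimate_missed([100.0, 150.0, 250.0], enumerated_away=200.0)
```

Here the weighted total of 500 persons not found at their regular address, less the 200 known to have been enumerated elsewhere, leaves an estimated 300 missed.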

The  sample  frame  for  the  1976  RRC  comprised  four 
nonoverlapping  and  exhaustive  sources: 

(a)  Persons  enumerated  at  their  regular  residence  in  the 
1971  census  (census  frame); 

(b)  Intercensal  births  (birth  frame); 

(c)  Intercensal  immigrants  (immigrant  frame);  and 

(d)  Persons  not  enumerated  at  their  regular  residence  in 
the  1971  census  (missed  frame). 

The "missed frame" in its totality exists only conceptually.
However, as explained above, the RRC of the previous
census resulted in a sample of such persons. The first
time one carries out a RRC, frame (d) might not be
available; the second time around, it would fail to include persons
missed in both of the two preceding censuses; generally,
the nth time around, frame (d) fails to include only persons
missed by each of the n preceding censuses. Since being
missed is significantly affected by age, the proportion of
persons missed by two or more consecutive censuses is
probably of rapidly diminishing significance.

Having selected a sample of persons from the four sources,
an extensive effort was mounted to locate their current
address (tracing). The most dogged determination resulted in
97 percent of selected persons being traced, including those
for whom it was confirmed that they had died or emigrated
since 1971. The failure rate of tracing, however, varied from
frame to frame: 2.9 and 2.6 percent for the census and birth
frames, respectively; 8.5 and 5.1 percent for the immigrant
and missed frames. The untraced persons were eventually
treated as nonresponse, in a fashion analogous to other
sample surveys: within each frame an appropriate ratio
adjustment was used. It is important to note, however, that
persons listed in frames for which tracing was more difficult
had a higher rate of being missed by the census, leading to
strong circumstantial evidence that those not traced were
likely missed at an even higher rate. Thus it is very probable
that, even after weighting for this "nonresponse" separately
for the different frames, the RRC estimates have a higher
bias (underestimating the underenumeration) whenever the
underenumeration itself is higher (assumptions A and C).

After  completion  of  tracing,  a  thorough  search  of  census 
records  was  carried  out  to  determine  whether  the  selected 
persons  were  enumerated  at  their  respective  traced  addresses. 
In  cases  where  the  tracing  indicated  that  the  selected  person 
died  prior  to  the  census,  a  search  of  the  death  register  was 
carried  out.  All  persons  not  found  in  either  the  census  or 
the  death  register  were  further  followed  up.  The  objectives 
of  the  followup  were  dual : 

(a)  To  confirm  that  the  traced  address  was  correct,  and/ 
or  to  obtain  other  addresses  where  the  person  may 
have  been  enumerated  in  the  census; 

(b)  To  obtain  at  the  same  time  a  number  of  census  char- 
acteristics for  the  persons  concerned  which,  should 
they  turn  out  to  have  been  missed  by  the  census, 
would  enable  us  to  provide  basic  tabulations  on  the 
characteristics  of  persons  missed  by  the  census. 

Persons who could not be traced in the followup operation
were added to the untraced (nonresponse) category.
This raised the untraced proportion to 4.8 percent overall,
but still left considerable differences among the four
frames: 4.0 percent in the census frame, 7.6 percent in the
birth frame, 10.6 percent in the immigrant frame, and 9.6
percent in the missed frame. Undoubtedly, the approach of
following up all those traced persons who could not readily
be found as enumerated in the census is correct, but it also
contributes to the eventual estimates of underenumeration
being conservative.

Appendix 1 shows the number of selected persons in each
final status category (enumerated, missed, deceased,
emigrated, and tracing failed) for each of the four frames. The
counts are unweighted. The overall proportion of missed
persons is higher than the final estimates for Canada
primarily for two reasons: the percent "missed" is
actually the proportion of persons not enumerated at home
(i.e., unadjusted for persons enumerated at a temporary
residence); and the sample design used included
oversampling of young males in the census frame (where
underenumeration was expected to be high), so the unweighted
proportion is an overestimate. Weighted and adjusted
estimates of the undercount for various population and
household groups are available in reference 4.

SOME  NUMERICAL  RESULTS 

Appendix 2 shows the basic results of the RRC by
province; appendix 3, by age and sex. Breakdowns by other
classifications are available in reference 4, including the
proportion of households missed by their characteristics. The
sample size used was slightly over 33,000 persons nationally,
from the four frames combined.

For purposes of allocating fixed per capita amounts to
the 10 provinces, the overall tests 1 and 3 both indicate that
adjustment is appropriate. This is also the case for test 2
(the term-by-term version of test 1) for every province. Test
4 is positive only for the four largest provinces (Ontario,
Quebec, British Columbia, Alberta) and New Brunswick
(because of the high value of $\hat{u}_i$). Test 1 is very
"comfortably" satisfied: the sample could be reduced by a factor of
36 and adjustment would still be indicated (i.e., even with
an overall sample size of less than 1,000 distributed over 10
provinces). Test 3 would be positive even with a sample size
reduced by a factor of 4 (less than 8,000 selected persons
nationally).

In the fixed total-cost case, tests 1 and 2 are
both positive. Test 2 would be positive even with a
60-percent reduction of the sample size actually used. Test 1
would be positive with a sample size one-fifth as large as
actually used, i.e., with about 6,600 persons. If one were to
construct a term-by-term equivalent of tests 1 and 2 (as in
the case of the fixed per capita allocation), the
corresponding tests would be all positive in the case of test 1, except
for New Brunswick (whose estimated underenumeration is
very close to the national average); in the case of test 2, none
of the term-by-term equivalents are positive.

It  is  interesting  to  consider  one  of  the  major  Canadian 
Federal-provincial  transfer  payment  schemes.  Skipping  the 
detailed  analysis  of  the  legislation,  it  turns  out  that  the 
relevant  formulas  to  use  are  provided  by  the  fixed  total 
payment  tests,  but  extending  the  summation  over  only  seven 
provinces (Newfoundland, Prince Edward Island, Nova
Scotia, New Brunswick, Quebec, Manitoba, Saskatchewan),
while still retaining the national average $\bar{u}$. Applying tests 1
and  2  of  the  fixed  total-cost  model  in  this  fashion,  they  are 
still  positive,  but  test  2  just  barely  so.  An  overall  proportional 
reduction  of  the  sample  by  even  1  percent  would  render 
the  sign  of  test  2  negative. 

OPTIMUM  ALLOCATION  OF  SAMPLE 

The  question  naturally  arises  how  to  allocate  the  sample 
designed to measure the undercounts $u_i$ so as to make the
adjustments  result  in  maximum  precision.  If  the  objective 
is  to  maximize  the  reduction  in  inequity  after  the  adjust- 
ment, then  one  wants  to  maximize  test  1  (in  both  the  fixed 
per  capita  allocation  and  fixed  total-cost  cases). 

Let us assume, for the sake of simplicity, that the variance
of $\hat{u}_i$ is inversely proportional to $n_i$. Then denote

$$\operatorname{var}\,\hat{u}_i = \frac{v_i}{n_i},$$

where $v_i$ includes all applicable design effects. In the case of
the fixed total-cost allocation, we will neglect the effect of
$\operatorname{var}\,\bar{u}$ in formulas (33) and (36). (Its numerical impact is
extremely small and most unlikely to influence optimal
sample allocation.)

It  is  now  easy  to  verify  that  the  optimal  allocation  (with 
fixed  overall  sample  size  n)  that  maximizes  the  value  of  test 
1  is 


$$n_i = n \, \frac{\sqrt{p_i v_i}}{\sum_j \sqrt{p_j v_j}} \qquad (38)$$


The same result applies in the case of both fixed per capita
and fixed total-cost allocations. If the differences in the
provincial design effects and in the proportions $u_i$ are not
large, a reasonable approximation of the above is to allocate
the sample in proportion to the square roots of the provincial
populations. This also has merit as a reasonable
compromise between designing the sample for obtaining the
best national estimate of underenumeration ($n_i$ proportional
to $p_i$) and obtaining equal reliability for every $u_i$ ($n_i$ equal
for all $i$).
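As a rough numerical sketch of the allocation rule, the snippet below reads equation (38) as sample shares proportional to $\sqrt{p_i v_i}$ and applies it to the 1976 census counts listed in Appendix 2. The function name `allocation_38` and the choice of one common variance factor for every province are illustrative assumptions, not part of the paper.

```python
import math

# 1976 census counts by province, as listed in Appendix 2.
CENSUS_1976 = {
    "Newfoundland": 557_725,
    "Prince Edward Island": 118_230,
    "Nova Scotia": 828_570,
    "New Brunswick": 677_250,
    "Quebec": 6_234_445,
    "Ontario": 8_264_465,
    "Manitoba": 1_021_510,
    "Saskatchewan": 921_320,
    "Alberta": 1_838_035,
    "British Columbia": 2_466_605,
}


def allocation_38(pop, v):
    """Return sample shares n_i / n proportional to sqrt(p_i * v_i)."""
    weights = {name: math.sqrt(pop[name] * v[name]) for name in pop}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Illustrative assumption: one common variance factor for every province,
# so the rule reduces to allocation proportional to sqrt(p_i).
common_v = {name: 0.02 for name in CENSUS_1976}
shares = allocation_38(CENSUS_1976, common_v)

for name, share in shares.items():
    print(f"{name:<22s}{share:6.3f}")
```

Under this equal-variance assumption the shares depend only on the square roots of the populations: roughly 0.22 of the sample goes to Ontario and roughly 0.026 to Prince Edward Island, far more even than the population shares themselves.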

If  we  want  to  maximize  the  chance  of  being  able  to  adjust 
the  census  counts  with  security,  we  would  want  to  allocate 
the  sample  to  maximize  the  value  of  test  3  in  the  case  of 
fixed  per  capita  allocation  and  of  test  2  in  the  case  of  fixed 
total-cost  allocation. 

It  can  be  verified  that  the  following  allocation  would 
achieve  this  second  objective: 


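$$n_i = n \, \frac{a_i}{\sum_j a_j} \qquad (39)$$

where

$$a_i^2 = t^2 p_i^2 x_i^2 v_i + p_i v_i$$

and where, further,

$x_i = u_i$ in the fixed per capita allocation case
$x_i = u_i - \bar{u}$ in the fixed total-cost allocation case

and $t$ is the positive root of the equation below

$$t^2 = \frac{n}{\left(\sum_i \frac{p_i^2 x_i^2 v_i}{a_i}\right)\left(\sum_i a_i\right)}$$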

The  right-hand  side  above  involves  t,  but  the  equation  can 
be  solved  iteratively,  starting  with  an  initial  value  of  t=0.  In 
fact,  the  convergence  appears  to  be  very  rapid  (the  first  com- 
puted value  of  t  in  several  examples  was  correct  to  three 
significant  digits). 

If $t$ were equal to zero, the allocation of (39) would reduce
to that of (38). For large values of $t$ (and if the values
$x_i$ and $v_i$ do not vary much among the provinces), the allocation
would be close to proportional to the provincial
populations $p_i$.
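The two limiting cases can be checked numerically. The sketch below assumes that the weights in (39) satisfy $a_i^2 = t^2 p_i^2 x_i^2 v_i + p_i v_i$ and treats $t$ as a given input rather than solving for it; the three population values and the function name `allocation_39` are hypothetical, chosen only so the limits are easy to read off.

```python
import math


def allocation_39(pop, x, v, t):
    """Shares proportional to a_i, where a_i^2 = t^2 p_i^2 x_i^2 v_i + p_i v_i."""
    a = [
        math.sqrt(t**2 * p**2 * xi**2 * vi + p * vi)
        for p, xi, vi in zip(pop, x, v)
    ]
    total = sum(a)
    return [ai / total for ai in a]


# Hypothetical populations (arbitrary units) with common x_i and v_i.
pop = [100.0, 400.0, 900.0]
x = [0.01, 0.01, 0.01]
v = [0.02, 0.02, 0.02]

# t = 0: weights are sqrt(p_i v_i), i.e. proportional to sqrt(p_i).
at_zero = allocation_39(pop, x, v, t=0.0)

# Very large t: the first term dominates and weights approach p_i.
at_large = allocation_39(pop, x, v, t=1e6)

sqrt_shares = [math.sqrt(p) / sum(math.sqrt(q) for q in pop) for p in pop]
pop_shares = [p / sum(pop) for p in pop]

print("t = 0    :", [round(s, 4) for s in at_zero])
print("sqrt(p)  :", [round(s, 4) for s in sqrt_shares])
print("t large  :", [round(s, 4) for s in at_large])
print("p        :", [round(s, 4) for s in pop_shares])
```

Intermediate values of $t$ give shares between these two extremes. No attempt is made here to reproduce the published $a_i$ column of table 1, since that depends on the scaling of $p_i$ and the value of $t$ used by the author.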

The following example illustrates the differences between
the two allocations. Let

$x_i = 0.01$ (about the average value of $|u_i - \bar{u}|$ in the
Canadian reverse record check; see appendix 1)

$v_i = 0.02$ (close to the average value of $u_i (1 - u_i)$
in the reverse record check; appendix 1).

Then $t = 7.14 \times 10^{-4}$ when $n = 33,000$ (the approximate reverse
record check sample size). Table 1 illustrates the proportion
of the sample to be allocated to each of 10 provinces
when $n_i$ is proportional, respectively, to $\sqrt{p_i}$, $a_i$, and $p_i$. As
expected, the allocation that is proportional to $a_i$ is in between
the allocations proportional to $p_i$ and $\sqrt{p_i}$.

Table 2 shows the realized values of tests 1 and 2 with
the parameters as indicated above (but with the value of
$\operatorname{var}\,\bar{u}$ neglected in the case of test 2).

Table 1. Alternative Sample Allocations

                          Proportion of sample when n_i is
                          proportional to                        Rev. rec.
                          ---------------------------------      check
Province                  $\sqrt{p_i}$    $a_i$      $p_i$       (actual)

Newfoundland              0.057           0.039      0.024       0.053
Prince Edward Island      0.026           0.016      0.005       0.051
Nova Scotia               0.070           0.050      0.036       0.051
New Brunswick             0.063           0.044      0.030       0.052
Quebec                    0.191           0.238      0.272       0.213
Ontario                   0.220           0.307      0.360       0.224
Manitoba                  0.077           0.057      0.045       0.056
Saskatchewan              0.073           0.053      0.040       0.055
Alberta                   0.104           0.087      0.080       0.093
British Columbia          0.120           0.109      0.108       0.152

Table 2. Realized Values of Tests 1 and 2 Under Alternative
Allocations (fixed total payment model)

                          Allocation proportional to             Rev. rec.
                          ---------------------------------      check
                          $\sqrt{p_i}$    $a_i$      $p_i$       (actual)

Test 1                    2,085           2,068      2,015       2,076
Test 2                    1,293           1,339      1,301       1,296

As usual, the optima appear to be broad. The allocation
proportional to $\sqrt{p_i}$ maximizes test 1, while the allocation
proportional to $a_i$ maximizes test 2. There is not much to
choose among them. The actual reverse record check allocation
was close to the $\sqrt{p_i}$ allocation, but adjusted for two
considerations: (a) to provide acceptable estimates of underenumeration
for every province (thus the significant upward
sample size adjustment for Prince Edward Island); and (b)
to take into account the forecast values of $u_i$ and $v_i$ (thus
also the significant upward adjustment in British Columbia,
where both $u_i$ and $v_i$ were forecast and turned out to be
significantly higher than elsewhere).

In  concluding  this  section,  it  is  perhaps  worth  once  again 
pointing  out  the  somewhat  curious  phenomenon  that  the 
optimal  sample  allocation  that  maximizes  test  1  (the  esti- 
mated reduction  in  the  inequity  measures  between  the 
unadjusted  and  adjusted  allocations)  is  different  from  the 
optimal  sample  allocation  that  maximizes  test  2  (which, 
roughly  speaking,  ensures  that  the  adjusted  census  counts 
actually  decrease  the  inequity  with  a  high  probability). 

CONCLUDING  REMARKS 

As  emphasized  in  the  introduction,  the  present  paper  does 
not purport to provide a tool for the definitive determination
of the answer to the question whether the census counts
should be adjusted, even for purposes of fund allocation and
even  if  the  appropriate  tests  are  positive.  The  tests  deal  only 
with  the  issue  of  equity,  and  then  only  for  formulas  that 
depend  on  census  counts  in  a  direct  and  untransformed 
fashion.  This  is  not  the  case  regarding  the  Canadian  Fiscal 
Arrangements  Act  payments,  except  in  census  years;  in  all 
other  years,  the  formula  uses  intercensal  population  esti- 
mates. It  is  yet  to  be  determined  whether  suitable  intercensal 
population  estimates  can  be  prepared  starting  with  the 
adjusted  census  counts. 

Some  legislated  fiscal  transfers  depend  on  counts  of  a 
subgroup  of  the  population  (e.g.,  students,  or  children  in 
low  income  families).  While  one  of  the  advantages  of  the 
reverse  record  check  methodology,  as  described  in  the 
previous  section,  is  its  ability  to  produce  breakdowns  of  the 
number  of  persons  missed  by  the  census  by  their  charac- 
teristics (as  such  it  is  different  from,  for  example,  analytic 
estimates  of  the  underenumeration  rate),  the  currently  used 
sample  sizes  may  not  be  adequate  for  adjusting  the  census 
counts  for  relatively  small  subgroups  of  the  population. 

Intercensal  population  estimates  are  also  used  in  the 
weighting of current household survey data through ratio
estimates. These survey estimates may, in turn, be used to
determine  large  transfers  of  funds.  This  is  the  case,  for 
example,  in  the  Canadian  Labour  Force  Survey,  whose  esti- 
mates of  the  unemployment  rate  (in  each  of  some  50  regions) 
determine  the  unemployment  insurance  benefits.  Since  un- 
employment is  significantly  higher  in  the  category  where 
the  census  undercount  is  also  worst,  i.e.,  young  males,  it  is 
conceivable  that  if  the  ratio  estimate  could  be  based  on 
adjusted  intercensal  population  estimates,  the  benefits  to  be 
received  by  some  Canadians  could  be  affected.  However, 
the  appropriate  adjustment  depends  not  only  on  the  method- 
ology of  producing  adjusted  intercensal  estimates  but  also 
on  being  able  to  adjust  (even  for  the  census  year)  by  pro- 
vince and  age-sex  groups,  which  the  current  RRC  sample 
size  may  not  support.  Research  is  currently  underway  to 
investigate  the  feasibility  of  producing  synthetic  adjustment 
factors  from  separate  sets  of  estimates  of  underenumeration 
by  province  and  by  age-sex  groups. 


Should  adjusted  census  counts  be  used  for  one  legislated 
use  (such  as  the  Federal  Provincial  Fiscal  Arrangements), 
can  the  use  of  unadjusted  counts  be  defended  for  other 
legislated  uses  (such  as  the  determination  of  unemployment 
insurance  benefits)? 

Another  problem  raised  by  Nisselson  [3]  is  that,  by  ad- 
justing the  census  counts,  we  may  remove  an  incentive  that 
otherwise  might  motivate  national  associations  of  difficult- 
to-enumerate  groups,  such  as  certain  minorities,  to  urge 
their  members  to  cooperate  with  the  census. 

The  above  is  only  a  very  partial  list  of  policy  issues  that 
have  to  be  addressed  before  a  decision  can  be  made  on 
whether  the  census  counts  should  be  adjusted  and,  if  so, 
for what purpose. One thing is certain: as Keyfitz [3]
pointed out, the issue should be decided
publicly and before the census is taken, so that the arguments
for and against adjustment can be considered in the
least  charged  atmosphere. 


APPENDIX 1
NUMBER OF CASES IN EACH FINAL STATUS CATEGORY BY FRAME

                   Census frame      Birth frame       Immigrant frame   Missed frame      All frames
Result             Number  Percent   Number  Percent   Number  Percent   Number  Percent   Number  Percent

Total              27,913   100.0     3,262   100.0     1,169   100.0       767   100.0    33,111   100.0
Enumerated         24,890    89.2     2,871    88.0       869    74.3       584    76.1    29,214    88.2
Missed                645     2.3        66     2.0        80     6.8        53     6.9       844     2.5
Deceased              978     3.5        47     1.4         1     0.1        36     4.7     1,062     3.2
Emigrated             272     1.0        29     0.9        95     8.1        20     2.6       416     1.3
Tracing failed      1,128     4.0       249     7.6       124    10.6        74     9.6     1,575     4.8


APPENDIX 2
ESTIMATED POPULATION UNDERCOVERAGE BY PROVINCE

                         Population              Number of persons        1976          1976 census       1976           1976 census
                         undercoverage rate      missed                   census        population        census         distribution
                         --------------------    ---------------------    population    adjusted for      distribution   adjusted for
                         Estimated   Standard    Estimated    Standard    total         undercoverage     (2)            undercoverage (1)
                         rate        error       number (1)   error                     (1) (2)           (percent)      (percent)
Province                 (percent)   (percent)

Newfoundland             (3) 1.10    0.39        (3) 6,200     2,200         557,725       563,915          2.43           2.41
Prince Edward Island     (3) 0.38    0.25        (3) 445         295         118,230       118,675          0.52           0.51
Nova Scotia              (3) 0.86    0.34        (3) 7,215     2,870         828,570       835,785          3.61           3.57
New Brunswick                2.16    0.37           14,960     2,615         677,250       692,210          2.95           2.96
Quebec                       2.95    0.25          189,655    16,225       6,234,445     6,424,095         27.19          27.45
Ontario                      1.52    0.17          127,155    14,170       8,264,465     8,391,625         36.05          35.85
Manitoba                 (3) 1.07    0.33        (3) 11,080    3,455       1,021,510     1,032,590          4.46           4.41
Saskatchewan             (3) 1.33    0.34        (3) 12,440    3,175         921,320       933,760          4.02           3.99
Alberta                      1.49    0.26           27,790     4,855       1,838,035     1,865,830          8.02           7.97
British Columbia             3.13    0.31           79,775     7,770       2,466,605     2,546,385         10.76          10.88

All Ten Provinces            2.04    0.10          476,715    23,890      22,928,155    23,404,870        100.00         100.00

Note: Yukon and Northwest Territories excluded.

(1) The marginal totals or percentages may differ slightly from the sum of individual totals or percentages due to rounding.

(2) The standard error figure for the corresponding estimated number of missed persons also applies to these totals.

(3) Estimate with high relative standard error.
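The columns of Appendix 2 appear to be linked by simple arithmetic: the adjusted population is the census count plus the estimated number missed, and the undercoverage rate is the number missed as a percentage of the adjusted population. This relationship is inferred from the published figures rather than stated in the text, so the following check is only a sketch; the name `undercoverage_rate` is illustrative.

```python
# Province: (1976 census count, estimated persons missed, published rate in percent),
# all taken from Appendix 2.
APPENDIX_2 = {
    "Newfoundland": (557_725, 6_200, 1.10),
    "Prince Edward Island": (118_230, 445, 0.38),
    "Nova Scotia": (828_570, 7_215, 0.86),
    "New Brunswick": (677_250, 14_960, 2.16),
    "Quebec": (6_234_445, 189_655, 2.95),
    "Ontario": (8_264_465, 127_155, 1.52),
    "Manitoba": (1_021_510, 11_080, 1.07),
    "Saskatchewan": (921_320, 12_440, 1.33),
    "Alberta": (1_838_035, 27_790, 1.49),
    "British Columbia": (2_466_605, 79_775, 3.13),
}


def undercoverage_rate(census, missed):
    """Persons missed as a percentage of the adjusted (census + missed) population."""
    return 100.0 * missed / (census + missed)


computed = {
    name: undercoverage_rate(census, missed)
    for name, (census, missed, _) in APPENDIX_2.items()
}

for name, (_, _, published) in APPENDIX_2.items():
    print(f"{name:<22s} computed {computed[name]:5.2f}   published {published:5.2f}")
```

Every computed rate agrees with the published rate to within rounding, which supports reading the rate as relative to the adjusted population rather than to the census count.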
APPENDIX 3
ESTIMATED POPULATION UNDERCOVERAGE BY AGE AND SEX

                         Population              Number of persons        1976          1976 census       1976           1976 census
                         undercoverage rate      missed                   census        population        census         distribution
                         --------------------    ---------------------    population    adjusted for      distribution   adjusted for
                         Estimated   Standard    Estimated    Standard    total         undercoverage     (2)            undercoverage (1)
                         rate        error       number (1)   error                     (1) (2)           (percent)      (percent)
Age and sex              (percent)   (percent)

MALE

Total                        2.46    0.17          287,830    19,990      11,415,370    11,703,200        100.00         100.00
0-4                          2.53    0.46           22,950     4,240         884,750       907,695          7.75           7.76
5-14                         1.14    0.21           24,520     4,590       2,123,480     2,148,005         18.60          18.35
15-19                    (3) 1.93    0.48        (3) 23,480    5,995       1,192,600     1,216,075         10.45          10.39
20-24                        5.99    0.52           67,680     6,270       1,062,345     1,130,025          9.31           9.66
25-34                        3.64    0.46           68,670     9,035       1,816,765     1,885,430         15.92          16.11
35-44                    (3) 2.33    0.48        (3) 31,300    6,580       1,310,970     1,342,265         11.48          11.47
45-54                    (3) 1.63    0.41        (3) 20,310    5,240       1,223,510     1,243,820         10.72          10.63
55-64                    (3) 1.28    0.34        (3) 12,000    3,260         926,530       938,530          8.12           8.02
65 and over              (3) 1.90    0.44        (3) 16,925    3^90          874,430       891,355          7.66           7.62

FEMALE

Total                        1.61    0.10          188,885    12,085      11,512,785    11,701,670        100.00         100.00
0-4                          2.07    0.36           17,770     3,180         839,645       857,420          7.29           7.33
5-14                     (3) 1.26    0.27        (3) 25,865    5,640       2,025,445     2,051,305         17.59          17.53
15-19                    (3) 2.05    0.51        (3) 23,990    6,060       1,146,135     1,170,135          9.96          10.00
20-24                        4.62    0.48           51,605     5,660       1,064,770     1,116,375          9.25           9.54
25-34                        2.03    0.38           37,090     7,045       1,791,680     1,828,770         15.56          15.63
35-44                    (3) 0.72    0.24        (3) 9,335     3,085       1,278,930     1,288,260         11.11          11.01
45-54                    (3) 0.81    0.38        (3) 10,135    4,820       1,244,735     1,254,865         10.81          10.72
55-64                    (3) 0.58    0.25        (3) 5,815     2,495         995,295     1,001,110          8.65           8.56
65 and over              (3) 0.64    0.38        (3) 7,280     4,305       1,126,150     1,133,430          9.78           9.69

Note: Yukon and Northwest Territories excluded.

(1) The marginal totals or percentages may differ slightly from the sum of individual totals or percentages due to rounding.

(2) The standard error figure for the corresponding estimated number of missed persons also applies to these totals.

(3) Estimate with high relative standard error.
INTRODUCTION

The  1980  census  will  be  subject  to  undercount,  and  esti- 
mates of  the  undercount  will  be  prepared  by  one  or  more 
techniques.  The  presence  of  error  in  the  population  figures 
implies  that  various  subgroups  of  the  population  may  be 
partially  denied  societal  benefits  to  which  they  are  entitled. 
These  benefits  include  equal  political  representation  as  well 
as  monetary  grants-in-aid  under  numerous  Federal  programs. 
For  example,  errors  in  the  1970  census  counts  for  States 
could  have  caused  one  or  more  States  to  lose  a  seat  in  the 
U.S. House of Representatives [13, 14]. There are also over
100  Federal  programs  that  allocate  funds  at  least  partly  on 
the  basis  of  population  data.  The  largest  such  program  is  gen- 
eral revenue  sharing,  which  currently  allocates  nearly  $7 
billion  per  year  to  State  and  sub-State  governmental  juris- 
dictions. Some  programs,  such  as  CETA  (Comprehensive 
Employment  and  Training  Act),  use  population  data  as 
thresholds  across  which  areas  must  pass:  to  qualify  as  a  prime 
sponsor  under  CETA,  an  area  must  comprise  at  least  100,000 
population. 

Given  an  estimate  of  the  undercount,  one  may  adjust  the 
census  count  and  then  recalculate  the  allocations  of  House 
seats or general revenue sharing or other funds to governments
or other groups [4, 11-14, 16, 17]. The observation
of  differences  between  the  adjusted  allocations  (i.e.,  the 
allocations  based  on  data  adjusted  for  estimated  undercount) 
and  the  unadjusted  allocations  has  given  rise  to  perceptions 
of  inequity  and  the  desire  by  many  to  have  the  data  adjusted 
for  undercount. 

Unfortunately,  the  estimates  of  undercount  are  themselves 
subject  to  error.  As  a  consequence  of  this  error,  the  adjusted 
allocations  may  be  further  from  the  target1  than  are  the 
unadjusted allocations (see the section below). Thus, adjustment
may be inequitable for some participants in the allocation
process. If the adjustments are to be equitable overall,
we  must  carefully  consider  both  the  accuracy  of  the  esti- 
mates of  undercount  and  what  in  fact  we  mean  by  equity 
in  this  context. 

This  paper  will  first  address  considerations  of  accuracy 
and  equity  separately.  We  next  consider  how  to  make  ad- 
justments that  maximize  equity,  subject  to  the  accuracy  of 
the  estimates  of  undercount  and  given  criteria  of  equity. 
Illustrative  calculations  will  be  presented. 


1  Here,  target  refers  to  the  allocations  that  would  be  calculated 
if  perfect  data  were  available. 


The  purposes  of  the  paper  are  to  (1)  identify  the  questions 
of  what  we  mean  by  equity  and  to  formulate  equity  concerns 
in  useful  ways;  (2)  heuristically  consider,  using  simple  ex- 
amples, what  implications  various  concepts  of  equity  have 
for  undercount  adjustment;  and  (3)  identify  a  few  questions 
for  further  consideration  and  research.  Our  considerations  of 
equity  will  be  restricted  to  the  area  of  fund  allocations  based 
on  census  data. 

ACCURACY  OF  ESTIMATES  OF 
UNDERCOUNT 

Estimates  of  undercount  are  based  on  data  and  on  as- 
sumptions and  thus  they  are  subject  to  error.  The  potential 
significance  of  this  error  is  illustrated  by  the  following 
example. 

Example  1 

Suppose  a  fixed  sum  of  money  is  to  be  allocated 
among  States  in  proportion  to  total  State  population 
(for  convenience  we  refer  to  the  District  of  Columbia  as 
a  State).  Then  the  fraction  of  the  total  going  to  each 
State  equals  the  estimate  of  the  ratio  of  State  population 
to  total  population.  Note  that  if  all  States  had  the  same 
relative  undercount  (undercount  expressed  as  a  fraction  of 
true  population),  these  ratios  would  be  unaffected.  How- 
ever, it  is  estimated  that  States  do  not  all  have  the  same 
relative  undercount;  those  whose  relative  undercount  is 
less  than  the  national  relative  undercount  are  overallo- 
cated  funds,  and  those  whose  relative  undercount  is 
greater  than  the  national  relative  undercount  are  under- 
allocated  funds.  If  adjusted  population  figures  are  used, 
some  States  will  gain  money  and  some  will  lose,  according 
to  whether  the  percent  increase  in  State  population  from 
adjustment  is  greater  or  less  than  the  percent  increase  in 
national  population  from  adjustment. 
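The mechanics of example 1 can be sketched with a small calculation. The three "States" and their undercount rates below are entirely hypothetical; the sketch assumes that relative undercount $r$ is a fraction of true population, so that the adjusted count equals the census count divided by $(1 - r)$.

```python
def shares(pop):
    """Fraction of a fixed sum going to each State under proportional allocation."""
    total = sum(pop.values())
    return {state: count / total for state, count in pop.items()}


def adjust(counts, relative_undercount):
    """Adjusted count = census count / (1 - r), with r a fraction of true population."""
    return {s: counts[s] / (1.0 - relative_undercount[s]) for s in counts}


# Hypothetical census counts for three States.
counts = {"A": 1_000_000, "B": 2_000_000, "C": 3_000_000}

# Same relative undercount everywhere: shares are unaffected.
uniform = adjust(counts, {"A": 0.02, "B": 0.02, "C": 0.02})

# Differing relative undercounts: some States gain, some lose.
uneven = adjust(counts, {"A": 0.04, "B": 0.02, "C": 0.01})

before = shares(counts)
after_uniform = shares(uniform)
after_uneven = shares(uneven)

national_increase = sum(uneven.values()) / sum(counts.values()) - 1.0
for state in counts:
    state_increase = uneven[state] / counts[state] - 1.0
    change = after_uneven[state] - before[state]
    print(
        f"{state}: increase {state_increase:.4f} "
        f"(national {national_increase:.4f}), share change {change:+.5f}"
    )
```

In the uneven case, a State's share rises exactly when its percent increase from adjustment exceeds the national percent increase, which is the rule stated in the example.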

The  Census  Bureau  has  used  alternative  methods  to 
prepare  estimates  of  undercount  for  States  in  1970 
[14].  For  each  of  these  alternative  methods,  a  set  of 
adjusted  population  counts  for  States  may  be  simply 
obtained  by  adding  the  estimate  of  undercount  to  the 
1970  census  count  for  each  State.  It  is  interesting  to 
compare  the  changes  in  allocation  under  two  alternative 
adjustments,  based  on  undercount  estimates  derived  by 
the  basic  synthetic  method  and  by  composite-2  method. 
(See  fig.  1  [14].) 
Figure 1. Percent changes in population-based distribution of a fixed sum of money
among States when 1970 census counts are adjusted by the basic synthetic
method and composite-2 method. Adapted from National Research Council,
Counting the People in 1980: An Appraisal of Census Plans. Washington, D.C.:
National Academy of Sciences, 1978, SPRR table VIII-A.




This  scatter  diagram  illustrates  two  important  points. 
First,  the  magnitudes  of  the  changes  in  allocation  can 
differ  under  alternative  estimates  of  undercount.  Here 
the  changes  under  the  composite-2  method  (horizontal 
axis)  are  generally  more  extreme  than  those  under  the 
basic  synthetic  method  (vertical  axis).  The  second  point 
to  be  appreciated  is  that  the  directions  of  the  adjustments 
can  differ  under  alternative  estimates  of  undercount.  Here 
the  directions  of  adjustment  under  composite-2  method 
and  under  the  basic  synthetic  method  differed  for  17 
States:  Alaska,  Arizona,  California,  Hawaii,  Idaho,  Illinois, 
Kentucky,  Louisiana,  Maine,  Maryland,  Michigan,  Nevada, 
New  Jersey,  New  Mexico,  South  Dakota,  Vermont,  and 
West Virginia. If we consider other estimates of undercount
as well [14, tables VII-D or F-1], we may note
that  the  directions  of  adjustment  also  vary  for  Alabama, 
Colorado,  and  New  York. 

In  example  1,  we  do  not  know  for  which  States  which 
method  gives  more  accurate  estimates  of  undercount.  A  point 
to  be  appreciated  is  that  whatever  estimates  of  undercount 
are  used  to  adjust,  they  will  be  subject  to  error,  and  there  is 
a  probability  that  the  adjustments  will  (1)  move  some  of  the 
allocations  farther  from  the  target,  and  (2)  result  in  an  under- 
allocation  to  a  State  (or  other  group)  that  was  previously 
not  subject  to  underallocation.  In  order  to  assess  these 
probabilities,  it  is  necessary  to  have  quantitative  estimates  of 
the  accuracy  of  the  estimates  of  undercount. 

Estimates  of  accuracy  may  be  completely  specified  as 
probability  distributions  for  the  estimates  of  undercount 
(or  equivalently,  for  the  alternative  population  estimates, 
where  these  are  equal  to  the  sum  of  the  population  count 
and  the  estimated  undercount).  For  example,  we  might 
believe  that  the  estimate  of  relative  undercount  of  an  area 
provided  by  a  certain  method  had  a  normal  distribution  with 
mean  0  and  standard  deviation  .01.  Deriving  these  individual 
(i.e.,  marginal)  probability  distributions  is  a  challenging  task. 
Deriving  a  joint  probability  distribution  will  be  even  more 
difficult. 

Different  techniques  will  be  needed  to  estimate  the 
marginal  distributions  for  estimates  provided  by  different 
methods.  For  example,  standard  statistical  theory  may  be 
used to estimate the variances of dual systems estimates [1],
but  estimates  of  biases  (e.g.,  response  correlation  bias  and 
matching  bias;  see  [9])  which  may  dominate  the  variances  re- 
quire other  approaches.  To  estimate  the  distributions  for  the 
demographic  method  estimates  requires  different  techniques. 
The  demographic  method  uses  complicated  models  and 
diverse  data  sources  to  construct  estimates  of  undercount. 
Some  of  the  parameters  in  the  models  represent  guesses,  and 
the  data  in  the  models  are  used  in  complex  ways.  A  possible 
but  untried  approach  to  estimate  the  distributions  of  the 
demographic  estimates  is  to  write  out  the  models  completely 
and  explicitly,  specify  probability  distributions  for  para- 
meters and  data  used  in  the  models,  and  use  the  delta 
method  to  produce  estimates  of  bias  and  variance. 


CONSIDERATIONS OF EQUITY

Because  the  estimates  of  undercount  are  subject  to  error, 
it  is  likely  that  the  adjustment  will  improve  the  accuracy  of 
the  allocations  to  some  parties  but  will  decrease  the  accuracy 
of  allocations  to  other  parties.  That  is,  adjustment  will  move 
some  allocations  closer  to  those  that  would  occur  with 
perfect  data  but  will  move  other  allocations  further  away.  In 
addition,  adjustment  will  cause  underallocations  to  some 
parties.  If  the  adjustments  are  in  fact  to  be  equitable  overall, 
we  need  to  carefully  consider  what  we  mean  by  equity. 
Equity  is  generally  taken  to  refer  to  the  "spirit  and  the  habit 
of fairness, justness, and right dealing" [2]. This may be
interpreted  in  various  ways,  and  for  the  present  purposes  we 
need  to  settle  on  one  explicit  interpretation.  Since  there  is 
no  unique  interpretation  of  equity,  it  should  be  considered 
as  a  convention,  i.e.,  equity  is  what  we  agree  it  is  for  the 
practical  purposes  of  determining  undercount  adjustment. 
Convention is used here in the sense of Keyfitz [7].

Agreeing  on  a  convention  could  involve  much  political 
debate.  However,  once  a  convention  of  equity  is  agreed 
upon,  it  becomes  a  technical  matter  to  determine  how  best 
to  achieve  the  agreed-upon  measure  of  equity.  The  statistical 
operations  would  be  insulated  from  the  political  give-and- 
take,  with  two  results:  (1)  Less  chance  of  politicization  of 
statistical  agencies,  and  (2)  more  efficient  achievement  of 
policy  goals  (political  debate  would  be  freed  from  statistical 
encumbrances,  so  political  actors  can  focus  on  specifying 
desires  rather  than  statistical  methods). 

Here  we  will  consider  equity  as  it  pertains  to  two  aspects 
of  the  adjustment  of  census  counts  and  allocations.  First  we 
consider  the  equity  of  the  end  product  of  the  census-taking 
and  adjustment  processes  as  a  whole,  that  is,  the  equity  of 
the  allocations.  Next,  we  consider  equity  as  it  applies  to  ad- 
justment as  a  process  in  itself,  separate  from  the  census- 
taking  operation.  In  both  cases,  we  need  to  formulate  ex- 
plicit and  precise  statements  of  what  we  mean,  operationally, 
by  equity.  In  particular,  as  we  shall  see  in  section  3,  the 
convention  of  equity  that  is  adopted  will  influence  our 
choice  of  adjusted  figures. 

Equity  of  Allocations 

To  consider  the  equity  of  the  allocations,  we  will  adopt 
the  perspective  of  decision  theory  and  construct  measures  of 
equity  to  reflect  our  preferences  over  alternative  sets  of 
adjusted  figures.  Thus,  considerations  will  not  be  limited  to 
narrow  interpretations  of  equity  as  fairness,  but  include 
social  welfare  concerns  as  well.  In  particular,  we  will  consider 
the  formulation  of  rules  that  rank  different  sets  of  adjusted 
figures,  or  allocations,  according  to  our  notions  of  equity. 
For  concreteness,  we  consider  the  allocation  of  a  fixed  sum 
among several parties. The allocations will usually be assumed
proportional  to  population  size.  Our  concern  here  is  to  com- 
pare the  equity  of  different  allocations. 
We take as a starting point the assumption that when a
set  of  allocations  is  identical  to  the  set  of  target  allocations, 
there  is  maximum  equity.  We  also  refer  to  this  as  zero  in- 
equity. It  is  important  to  assume  that  the  target  values 
coincide  with  those  allocations  that  would  occur  if  there 
were  no  error  in  the  data.  Of  course,  the  actual  allocation 
formulas  are  themselves  imperfect,  and  it  is  probably  argu- 
able in  individual  cases  that  greater  social  good  can  occur 
under  some  perturbation  of  the  target  values  defined.  How- 
ever, these  concerns  should  not  be  brought  into  the  context 
of  adjusting  census  counts  because  of  data  inaccuracy— they 
pertain  to  the  allocation  formulas  rather  than  to  the  data 
used.  For  further  discussion  of  statistical  equity  (arising  from 
good  data)  and  political  equity  (arising  from  good  formulas), 
the  reader  is  referred  to  the  paper  by  Robert  Hill,  also  in  this 
publication. 

Any  possible  deviations  of  allocations  from  the  target 
values  (as  defined  above)  imply  less  than  maximum  equity, 
or a degree of inequity. Given two alternative sets of allocations,
both imperfect, it is important to be able to compare
their levels of inequity.

Example  2 

Suppose  for  simplicity  that  there  are  three  parties  par- 
ticipating in  the  population-based  allocation  of  $100 
million.  Assume  also  that  each  party  has  the  same  popu- 
lation size  (hence  the  same  target  value)  and  that  the 
three  parties  are  labeled  1,  2,  and  3. 

Consider  two  alternative  sets  of  allocations  with  the 
following  properties: 


              Error in allocation 1      Error in allocation 2
              (millions of dollars)      (millions of dollars)

Party 1              -4.0                       -2.0
Party 2              +2.0                       +1.0
Party 3              +2.0                       +1.0

To  decide  which  allocation  is  more  equitable,  it  is  neces- 
sary to  have  some  criterion  for  comparison.  Notice  that 
under  allocation  2  each  component  misallocation  (i.e., 
the  error  in  allocation  to  each  party)  is  smaller  than  under 
allocation  1.  Allocation  2  is  closer  than  allocation  1  in 
every  way  to  the  target  values,  and  it  ought  therefore  to 
be  considered  more  equitable. 

Other  situations  invite  more  disagreement  about  which 
set  of  allocations  is  more  equitable.  Consider  allocations 
3  and  4  below: 


                                  Error in allocation 3      Error in allocation 4
                                  (millions of dollars)      (millions of dollars)

Party 1                                  +1.6                       -1.5
Party 2                                  -0.8                       +1.5
Party 3                                  -0.8                        0

Sum of misallocations
  (sign disregarded)                      3.2                        3.0
Sum of squared misallocations             3.8                        4.5
Total underallocations                    1.6                        1.5
Sum of squared underallocations           1.3                        2.2
Largest underallocation                   0.8                        1.5

If our concerns for equity center on the total amount of
money that is misallocated or on the total amount
underallocated, then allocation 4 is preferable. The total
misallocated under allocation 3 is $3.2 million, while the
total for allocation 4 is $3.0 million. The total
underallocations are $1.6 million and $1.5 million, respectively.
Hence, allocation 4 is more equitable. Alternatively, we
might think that large errors are disproportionately worse
than small, so that the -$1.5 million error to party 1
under allocation 4 is more inequitable than the combined
effect of the two -$0.8 million errors to parties 2 and 3
under allocation 3.

Loss  Functions 

In  adjusting  census  counts  and  allocations,  it  is  not  possi- 
ble to  directly  consider  the  comparative  inequities  of  all 
likely  sets  of  allocations.  For  one  thing,  there  are  too  many 
allocations  possible.  Also,  we  have  not  yet  specified  precisely 
how  we  want  to  evaluate  equity  so  that  we  can  make  the 
comparisons.  A  solution  to  this  dilemma  is  to  utilize  explicit 
loss  functions  to  represent  an  agreed-upon  conception  of 
equity.  Loss  functions  are  constructs  that  concisely  and 
tractably  represent  one's  preferences.  In  the  present  context, 
a  loss  function  will  assign  a  number  to  each  set  of  allocations 
in  such  a  way  that  whenever  a  set  of  allocations  A  is  more 
equitable  (under  an  agreed-upon  conception  of  equity)  than 
another  set  of  allocations  B,  the  loss  function  assigns  a 
greater  number  to  B  than  to  A.  (To  avoid  possible  confusion, 
it  should  be  noted  that  since  different  loss  functions  repre- 
sent different  preferences,  it  is  not  meaningful  to  compare 
the  values  of  different  loss  functions.  Only  the  values  of  the 
same  loss  function  over  alternative  allocations  are  com- 
parable.) 

Example  3 

Consider once again allocations 3 and 4 as described
in example 2. If equity is to be judged on the basis of the
total amount of money misallocated, then we may define
the loss function to be proportional to the sum of the
absolute values of the errors in allocation. Thus the
numerical value taken by the loss function is 3.2 for
allocation 3 and 3.0 for allocation 4. (The proportionality
constant does not matter.) The loss function reliably
represents our preferences, telling us that allocation 4 is
more equitable than allocation 3 under this conception
of equity. Applying the same loss function to allocations
1 and 2 of example 2, we observe that the value of the
loss function is 8.0 for allocation 1 and 4.0 for allocation
2, so that allocation 2 is more equitable than allocation 1.
The loss function also tells us that allocations 3 and 4
are both more equitable than allocations 1 or 2, because
the value of the loss function is smaller for allocations 3
and 4 (3.2 and 3.0) than for allocations 1 and 2 (8.0 and
4.0).

An  alternative  equity  concept  reflects  disproportion- 
ately greater  concern  with  large  errors  than  small;  recall 
that  under  this  concept  allocation  3  might  be  judged  more 
equitable  than  allocation  4.  A  way  to  react  to  this  concern 
about  large  errors  is  to  consider  the  square  of  the  errors 
rather  than  their  magnitudes.  If  we  consider  overalloca- 
tions  and  underallocations  to  be  equally  inequitable,  we 
can  consider  the  loss  function  defined  to  be  the  sum  of 
the  squares  of  the  errors  in  allocation.  This  loss  function 
takes  the  value  3.84  for  allocation  3  and  4.5  for  alloca- 
tion 4,  so  allocation  3  is  more  equitable. 

Other equity criteria treat overallocations and
underallocations differently. For example, the inequity of an
overallocation to a party might be considered negligible,
since the party is receiving a net gain in allocation and thus
suffers no harm. If, simultaneously, there is extreme
concern with large underallocations, then one might
compare the equity of alternative allocations on the basis of
the sum of squares of the underallocations. The value of
this loss function is 1.3 for allocation 3 and 2.2 for
allocation 4, so allocation 3 is more equitable under this
loss function also.
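The comparisons in examples 2 and 3 can be reproduced with a short sketch. The error values are copied from example 2; the function and variable names are illustrative assumptions, not notation used in the paper.

```python
# Errors in allocation (millions of dollars), copied from example 2.
ALLOCATIONS = {
    1: [-4.0, 2.0, 2.0],
    2: [-2.0, 1.0, 1.0],
    3: [1.6, -0.8, -0.8],
    4: [-1.5, 1.5, 0.0],
}


def absolute_loss(errors):
    """Sum of the misallocations, sign disregarded."""
    return sum(abs(x) for x in errors)


def squared_loss(errors):
    """Sum of the squared misallocations."""
    return sum(x * x for x in errors)


def squared_underallocation_loss(errors):
    """Sum of the squared underallocations; overallocations count as zero."""
    return sum(x * x for x in errors if x < 0)


def more_equitable(loss, a, b):
    """Label of whichever allocation has the smaller loss (a on a tie)."""
    return a if loss(ALLOCATIONS[a]) <= loss(ALLOCATIONS[b]) else b


if __name__ == "__main__":
    for loss in (absolute_loss, squared_loss, squared_underallocation_loss):
        values = {k: round(loss(v), 2) for k, v in ALLOCATIONS.items()}
        print(loss.__name__, values, "prefers", more_equitable(loss, 3, 4))
```

Only values of the same loss function are compared with one another, in keeping with the caution given under "Loss Functions."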

Example  3  has  illustrated  how  loss  functions  can  be  used 
to  represent  different  concepts  of  equity  of  allocations.  The 
question  of  what  concept  of  equity,  or  equivalently,  what 
loss  function  should  be  used  has  not  yet  been  addressed. 
Therefore,  several  more  complex  concepts  of  equity  will 
be  considered  below. 

Most methods for deriving loss functions to measure
equity or, equivalently, for comparing the inequities of two
sets of allocations proceed in two steps [3, 5, 15, 16]. First,
the inequity to each party in the allocation process is
considered. Then, from these individual considerations, the
overall equity is evaluated.

In  considering  loss  functions  to  represent  more  complex 
concepts  of  accuracy,  we  will  focus  primarily  on  how  to 
evaluate  the  individual  inequities  to  the  different  parties  in 
the  allocation  program.  The  measure  of  overall  equity  will 
be  taken  here  as  the  sum  of  the  individual  inequities.  (Other 
formulations are possible as well but will not be used here.)²


In particular, we will consider (1) ways to treat inequity
from overallocation and underallocation differently, (2)
differential treatment of different parties, and (3) how to
consider equity for subgroups (e.g., racial or ethnic groups)
that do not receive allocations directly, but share in
allocations to the political jurisdictions in which they reside.

Overallocation  versus  Underallocation 

It  is  clear  that  a  positive  inequity  is  associated  with  a 
party  that  is  underallocated  funds  because  of  data  error. 
But  what  of  a  party  that  is  overallocated  funds?  When  a 
fixed  sum  is  being  allocated,  an  overallocation  to  one  party 
implies  underallocation  to  another,  and  thus  overallocations 
contribute  indirectly  to  the  overall  inequity.  However,  since 
the  inequity  from  underallocation  is  reflected  in  the  measures 
of  inequity  for  those  parties  subject  to  underallocation,  the 
inequity  from  overallocation  should  be  considered  separately. 
Since  overallocation  to  a  party  produces  a  net  gain  for  that 
party,  it  is  conceivable  that  an  individual  overallocation  is 
associated  with  negative  disutility. 

Example  4 

To  illustrate  the  difference  in  individual  inequities 
arising  from  overallocation  and  underallocation,  consider 
a  population-based  allocation  of  a  fixed  sum  to  two 
parties,  where  it  is  not  known  who  is  party  1  and  who  is 
party  2.  Suppose  that,  despite  adjustment  of  data,  the  best 
allocation  possible  would  be  overallocating  $1  million  to 
party  1  and  underallocating  $1  million  to  party  2.  Now 
suppose  that  by  spending  an  extra  $1.8  million  on  data, 
the  shares  of  the  fixed  sum  to  be  allocated  could  be  de- 
termined perfectly,  but  this  data  expenditure  would  be 
taken  from  the  fixed  sum  to  be  allocated.  Then  both 
parties  would  be  underallocated  (relative  to  that  original 
fixed  sum)  $0.9  million. 


              Error in allocation 5      Error under allocation 6
              (with existing data)       (with extra data)
              (millions of dollars)      (millions of dollars)

Party 1              +1.0                       -0.9
Party 2              -1.0                       -0.9


2  In  effect,  we  are  adopting  a  utilitarian  approach.  Loss  functions 
based  solely  on  the  largest  misallocation  or  largest  underallocation  do 
not  fall  in  the  class  of  loss  functions  we  consider,  but  they  can  be 
approximated  by  loss  functions  we  do  consider. 


Which allocation is more equitable? It depends upon
how the inequities from overallocation are perceived.
Suppose we represent the overall inequity by the sum of
the absolute values of the underallocations, plus a
constant C, times the sum of the overallocations:

Sum of underallocations + C x sum of overallocations

(Similar examples can be constructed for squared error or
other loss functions.) Then the inequity for allocation 5
is measured at 1 + C and the inequity for allocation 6 is
measured at 1.8. In this class of loss functions, allocation
5 is judged to be more equitable than allocation 6 only if
C is less than 0.8.

To push the example further, suppose the extra data
would have cost only $1 million rather than $1.8 million,
so that allocation 7 would have occurred.

              Error under allocation 5   Error under allocation 7
              (millions of dollars)      (millions of dollars)

Party 1              +1.0                       -0.5
Party 2              -1.0                       -0.5


Now  allocation  5  is  more  equitable  than  allocation  7 
(under  the  loss  function  above)  only  if  C  is  negative;  i.e., 
only  if  overallocations  are  treated  as  negative  inequities. 
If  we  are  indifferent  between  the  two  allocations,  then  C 
is  zero,  and  if  allocation  7  is  preferred,  then  C  is  positive. 
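The thresholds on C quoted in example 4 can be checked numerically. The sketch below uses the error values from the tables of example 4; the function name and the trial values of C are assumptions made only for illustration.

```python
def asymmetric_loss(errors, c):
    """Sum of underallocations plus c times the sum of overallocations."""
    under = sum(-x for x in errors if x < 0)  # magnitudes of negative errors
    over = sum(x for x in errors if x > 0)
    return under + c * over


# Errors in allocation (millions of dollars), from example 4.
ALLOCATION_5 = [1.0, -1.0]   # existing data
ALLOCATION_6 = [-0.9, -0.9]  # extra data costing $1.8 million
ALLOCATION_7 = [-0.5, -0.5]  # extra data costing $1 million

if __name__ == "__main__":
    for c in (-0.5, 0.0, 0.5, 0.9):  # trial values of C (illustrative)
        l5 = asymmetric_loss(ALLOCATION_5, c)
        l6 = asymmetric_loss(ALLOCATION_6, c)
        l7 = asymmetric_loss(ALLOCATION_7, c)
        print(f"C={c:+.1f}: 5 beats 6: {l5 < l6}, 5 beats 7: {l5 < l7}")
```

Because allocations 6 and 7 contain no overallocations, their losses do not depend on C; only the loss for allocation 5 (1 + C) moves as C changes.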

Example  4  suggests  that  we  may  want  to  treat  overalloca- 
tions and  underallocations  asymmetrically.  In  particular, 
we  may  want  to  treat  overallocations  to  individual  parties  as 
implying  individual  benefits.  This  consideration  has  in- 
teresting implications  for  the  form  of  the  measures  of  in- 
dividual inequities.  For  example,  consider  the  loss  function 
defined  to  be  proportional  to  the  sum  of  the  squares  of  the 
underallocations,  plus  a  negative  constant  C,  times  the  sum 
of  the  squares  of  the  overallocations: 

Sum  of  squared  underallocations 
+  C  x  sum  of  squared  overallocations  C  <  0 

Such  a  loss  function  may  violate  one  of  our  basic  stipulations 
about  our  concept  of  overall  equity,  viz,  perfect  equity 
occurs  when  the  set  of  allocations  equals  its  set  of  target 
values  and  otherwise  a  degree  of  inequity  occurs.  The  loss 
function  just  proposed  takes  the  value  zero  when  there  are 
no errors in allocation. But the loss function can take
negative values for some possible configurations of errors in
allocations, implying that these configurations are more
equitable than that of no errors in allocations. For example, if
C = -0.6, then the value of this loss function for allocation
3 is -0.256 (= 1.28 - .6 x 2.56). Loss functions taking negative
values  are  not  permissible  because  they  do  not  faithfully 
reflect  our  concept  of  statistical  equity.  Statistical  and 
political  concerns  need  to  be  kept  separate.  It  would  be  a 
mistake  to  try  to  use  statistical  adjustments  to  compensate 
for  perceived  shortcomings  in  the  allocation.  As  the  example 
shows,  care  is  needed  in  formulating  loss  functions  to  rep- 
resent concepts  of  equity. 

We note in passing that in a true fixed-pie allocation, the
asymmetric losses to individuals may combine to yield a
symmetric aggregate loss function. Thus, if the loss function
is taken to equal the sum of underallocations, plus a constant
C, times the sum of the overallocations, then the loss
function may be reexpressed as (1 + C)/2, times the sum of the
absolute errors in allocation. (To see this, observe that in a
fixed-pie allocation, the sum of underallocations and the sum
of overallocations are equal to each other, hence equal to
half the sum of the absolute errors in allocation.)
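Both observations above, the negative value of the squared asymmetric loss for allocation 3 and the fixed-pie identity, can be verified with a small sketch. The error values come from example 2; the function names and the trial value C = 0.3 are assumptions for illustration.

```python
def split_errors(errors):
    """Return (magnitudes of underallocations, overallocations)."""
    under = [-x for x in errors if x < 0]
    over = [x for x in errors if x > 0]
    return under, over


def asymmetric_squared_loss(errors, c):
    """Sum of squared underallocations + c * sum of squared overallocations."""
    under, over = split_errors(errors)
    return sum(u * u for u in under) + c * sum(o * o for o in over)


def asymmetric_linear_loss(errors, c):
    """Sum of underallocations + c * sum of overallocations."""
    under, over = split_errors(errors)
    return sum(under) + c * sum(over)


def symmetric_form(errors, c):
    """(1 + c)/2 times the sum of absolute errors; valid when errors sum to zero."""
    return (1 + c) / 2 * sum(abs(x) for x in errors)


ALLOCATION_3 = [1.6, -0.8, -0.8]
ALLOCATION_4 = [-1.5, 1.5, 0.0]

if __name__ == "__main__":
    # Negative loss for allocation 3 when C = -0.6.
    print(round(asymmetric_squared_loss(ALLOCATION_3, -0.6), 3))
    # Fixed-pie identity for a trial value of C.
    for errors in (ALLOCATION_3, ALLOCATION_4):
        print(asymmetric_linear_loss(errors, 0.3), symmetric_form(errors, 0.3))
```

The identity relies on the errors summing to zero, which holds for allocations 3 and 4 but not for allocations 6 and 7 of example 4, where the cost of data reduces the sum to be allocated.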

Differential  Treatment  of  Different  Parties 

In  practice  there  may  be  a  strong  desire  to  treat  different 
groups  differently.  For  example,  an  underallocation  of  a 
given  dollar  amount  may  be  judged  more  inequitable  for  a 
State  with  a  small  population,  such  as  Vermont,  than  for  a 
State  with  a  large  population,  such  as  California.  One  way  to 
accommodate  differential  treatment  is  to  weight  the  indi- 
vidual inequities  by  different  factors.  That  is,  we  may  con- 
sider loss  functions  defined  to  be  weighted  sums  of  the 
individual  inequities. 

Example  5 

Let the subscript $i$ refer to a party (e.g., a State or local
government) in the allocation process, let $x_i$ denote the
error in the allocation to party $i$, and let $w_i$ denote a
weight for party $i$. First, consider the simple loss function
defined by:

$$\sum_i |x_i|$$

where $\sum_i$ is read "the sum over all parties $i$" and $|x_i|$ is the
absolute value of the error in allocation $x_i$. To treat
different parties differentially, we may consider the loss
function

$$\sum_i w_i |x_i| \qquad (1)$$

For example, $w_i$ might be taken inversely proportional to
the population of unit $i$. This reflects concern with per
capita errors in allocation. Or, $w_i$ might be taken inversely
proportional to the target value for party $i$. This reflects
concern with proportional errors in allocation.

We may wish to use different weights depending on
whether the error in allocation is positive or negative. Let
$I^+$ = the set of units with overallocations, $I^-$ = the set of
parties with underallocations, and let $w_i^+$ and $w_i^-$ be two
weights for party $i$. The numbers $w_i^+$ and $w_i^-$,
respectively, weight overallocations and underallocations to
party $i$. We require the weights $w_i^-$ to be positive because
underallocations imply positive inequity. We may consider
the loss function given by:

$$\sum_{I^+} w_i^+ x_i + \sum_{I^-} w_i^- |x_i| \qquad (2)$$

where $\sum_{I^+}$ (or $\sum_{I^-}$) is read "the sum over all units $i$ with
overallocations (or underallocations)."³ For example, one
might choose $w_i^+$ inversely proportional to population (to
reflect concern about per capita overallocation), and one might
choose $w_i^-$ inversely proportional to the product of
population and per capita income, to reflect concern that per
capita underallocations are more severely felt by parties with
lower per capita income.
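A weighted loss of the form of display (2) might be sketched as follows. The populations used to build per capita weights are hypothetical numbers chosen only for illustration, and the function name is an assumption; the error values are those of allocation 3 in example 2.

```python
def weighted_loss(errors, w_plus, w_minus):
    """Display (2): weight overallocations by w_plus and
    the magnitudes of underallocations by w_minus, party by party."""
    total = 0.0
    for x, wp, wm in zip(errors, w_plus, w_minus):
        if x > 0:
            total += wp * x          # overallocation
        elif x < 0:
            total += wm * abs(x)     # underallocation
    return total


# Errors for allocation 3 of example 2 (millions of dollars).
ERRORS = [1.6, -0.8, -0.8]

# Hypothetical populations (millions), NOT taken from the paper.
POPULATIONS = [1.0, 4.0, 20.0]

# Weights inversely proportional to population: per capita errors.
PER_CAPITA = [1.0 / p for p in POPULATIONS]

if __name__ == "__main__":
    # Equal unit weights reduce (2) to the sum of absolute errors.
    print(weighted_loss(ERRORS, [1.0] * 3, [1.0] * 3))
    # Per capita weights for both over- and underallocations.
    print(weighted_loss(ERRORS, PER_CAPITA, PER_CAPITA))
```

With the same weight for both signs of error, display (2) reduces to display (1); distinct `w_plus` and `w_minus` lists give the asymmetric treatment described above.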

As discussed in the paragraph following example 4,
we may want to choose $w_i^+$ to be negative. In this case,
however, our choice of weights is constrained. For
example, suppose that for all units $i$ the ratio of $w_i^+$ to $w_i^-$
is a constant C lying between 0 and -1. Then, if the loss
is to take its minimum, zero, only when there are no
errors in allocation, it is necessary⁴ that the ratio of the
smallest $w_i^-$ to the largest $w_j^-$ be greater than the
absolute value of C. This is a potentially significant constraint.
It shows that the following loss function is not permissible
for measuring the inequity of errors in allocations to
States:

Sum of per capita underallocations + C x sum of per
capita overallocations

where C is less than -0.04 (note that the ratio of the
reciprocal of the largest State population to the reciprocal
of the smallest is less than 0.04). In formulating loss
functions, constraints such as this need to be considered.

3 This may be reexpressed as $\sum_i (w_i^+ x_i^+ + w_i^- x_i^-)$, where
$x_i^+ = \max(x_i, 0)$ and $x_i^- = \max(-x_i, 0)$.

4 A simple example illustrates this. Suppose a fixed amount is to
be allocated to two parties, indexed by 1 and 2, such that party 1 is
overallocated by a positive amount X and party 2 is underallocated by
X. Let the weights be given by $w_1^- = 1$, $w_1^+ = C$, $w_2^- = A$, and
$w_2^+ = AC$. Suppose that A is less than 1, so that $w_2^-$ is smaller than
$w_1^-$. The value taken by the loss function is clearly CX + AX, which is
greater than zero only if A, the ratio of $w_2^-$ to $w_1^-$, exceeds the
absolute value of C. Further discussion is found in [15].

Population Subgroups

The recipients of allocations under Federal grants-in-aid
programs are typically State and local governments. But
concern for equity also extends to racial, ethnic, and other
groups who may not receive allocations directly, but who
share in the allocations to political jurisdictions. Thus, for
example, the disparity between the estimates of undercount
for blacks (7.7 percent) and whites (1.9 percent) suggests
that, as a group, blacks have been underallocated funds
under programs that use population data.

Example 6

Consider an allocation program that allocates a fixed
sum of money to States in proportion to total population.
Assume that every subgroup in each State shares in the
State's allocation proportionally to the size of the
subgroup in the State. If $P_{is}$ is the true population of
subgroup $s$ in State $i$, $P_{+s}$ is the true total population over all
States of subgroup $s$, $P_{i+}$ is the true total population in
State $i$, and $P_{++}$ is the true total national population,
then the fraction of the total national allocation that
should go to State $i$ is $P_{i+}/P_{++}$. The fraction of the total
national allocation that should go to subgroup $s$ in State
$i$ is

$$\frac{P_{is}}{P_{i+}} \cdot \frac{P_{i+}}{P_{++}} = \frac{P_{is}}{P_{++}}$$

The fraction of the total national allocation that should
go to subgroup $s$ is obtained by summing $P_{is}/P_{++}$ over all
States $i$ to yield

$$\frac{P_{+s}}{P_{++}}$$

Letting $s$ refer to blacks, using estimates of 7.7 percent
undercount for blacks and 2.5 percent undercount
nationwide, the fraction of the total national allocation for
blacks as a group is estimated to be too low by 5.3
percent, or

$$\frac{P_{+s}}{P_{++}} - \frac{(1 - .077)\,P_{+s}}{(1 - .025)\,P_{++}} = .053\,\frac{P_{+s}}{P_{++}}$$


Generally, the error $x_s$ in allocation to subgroup $s$ arising
from errors $x_i$ in allocations to States $i$ is

$$x_s = \sum_i \frac{P_{is}}{P_{i+}}\, x_i$$

where $P_{is}$ is the population of subgroup $s$ in State $i$ and $P_{i+}$
is the total population of State $i$. Measures of inequity for
individual subgroups may be formulated analogously to the
measures for other individual parties described earlier. Thus,
possible measures of inequity for a subgroup $s$ include
$w_s |x_s|$ and $w_s (x_s)^2$, where $w_s$ is a positive weight. Another
possible measure is:

$w_s^+ x_s$ if $x_s$ is an overallocation, and

$w_s^- |x_s|$ if $x_s$ is an underallocation

where $w_s^+$ is a (possibly negative) weight and $w_s^-$ is a
positive weight.
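The two subgroup calculations of example 6 can be sketched in a few lines. The undercount rates (7.7 and 2.5 percent) are those quoted in the text; the State errors and populations in the usage lines are hypothetical, and the function names are assumptions.

```python
def relative_shortfall(subgroup_undercount, national_undercount):
    """Proportion by which a subgroup's share of the national allocation
    is too low when shares are based on unadjusted counts."""
    counted_share_ratio = (1.0 - subgroup_undercount) / (1.0 - national_undercount)
    return 1.0 - counted_share_ratio


def subgroup_error(state_errors, subgroup_pops, state_pops):
    """x_s = sum over States i of (P_is / P_i+) * x_i."""
    return sum(
        p_is / p_i * x_i
        for x_i, p_is, p_i in zip(state_errors, subgroup_pops, state_pops)
    )


if __name__ == "__main__":
    # Rates quoted in the text: 7.7 percent (blacks), 2.5 percent (Nation).
    print(round(relative_shortfall(0.077, 0.025), 3))

    # Hypothetical two-State illustration (values are not from the paper).
    state_errors = [2.0, -2.0]      # millions of dollars
    subgroup_pops = [1.0, 3.0]      # millions of persons in subgroup s
    state_pops = [10.0, 10.0]       # millions of persons in each State
    print(subgroup_error(state_errors, subgroup_pops, state_pops))
```

The second function assumes, as example 6 does, that each subgroup shares in its State's allocation in proportion to its size.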

An important question is how to combine the measures
of inequity for individual subgroups into a measure of total
inequity. For example, considering the measure of individual
inequity to have the general form $w|x|$, is it desirable to
combine subgroup inequities into a measure of "overall inequity
for subgroups," say, $\sum_s w_s |x_s|$? Furthermore, how do we want
to combine the measures of inequity to subgroups and the
measures of inequity to parties (e.g., State governments)
participating directly in the allocation program? Do we want
to add them; for example, do we want to use

$$\sum_i w_i |x_i| + \sum_s w_s |x_s| \; ?$$

If not, how should we simultaneously minimize both
measures of overall inequity $\sum_i w_i |x_i|$ and $\sum_s w_s |x_s|$? That is, what
tradeoffs do we prefer between inequities to subgroups $s$ and
parties $i$ that directly participate in the allocation program?
In any case, what subgroups are to be considered?

(We note, however, that the situation simplifies
if each subgroup $s$ is contained wholly within one State (or
party) $i$ so that the subgroup may be denoted $s(i)$. For
example, the black population of each State could be
considered separately rather than as a national group. In this
case, the error in allocation to subgroup $s(i)$ is

$$x_{s(i)} = \frac{P_{is}}{P_{i+}}\, x_i$$

and if the inequity to subgroup $s(i)$ is $w_{s(i)} |x_{s(i)}|$, then the
total inequity to States (or parties generally) and to
subgroups can be written as

$$\sum_i w_i |x_i| + \sum_i \sum_s w_{s(i)} \frac{P_{is}}{P_{i+}} |x_i|$$

or, more simply, $\sum_i w_i^* |x_i|$, where $w_i^* = w_i + \sum_s w_{s(i)} P_{is}/P_{i+}$.
This is of the general form of display (1) in the preceding
section.)


Equity  of  the  Adjustment  Process 

So  far,  we  have  been  considering  the  equity  in  the  final 
allocations.  Now  we  consider  the  equity  of  the  adjustment 
process  as  a  separate  operation.  Because  the  estimates  of 
undercount  are  subject  to  error,  there  is  a  probability  that 
the  adjustment  will  (1)  cause  some  parties  to  be  underallo- 
cated  whereas  they  otherwise  would  not  be  and  (2)  increase 
the  severity  of  the  underallocation  to  other  parties. 

Situation  (1)  could  occur  if  a  State  whose  relative  under- 
count was  equal  to  or  less  than  that  of  the  Nation  as  a  whole 
was  estimated  to  have  an  even  smaller  relative  undercount 
(compared  to  the  Nation  as  a  whole).  In  this  case,  the  adjust- 
ment could  greatly  reduce  the  State's  allocation.  For  a  State 
in  this  situation,  the  adjustment  would  cause  an  underalloca- 
tion. 

Situation  (2)  could  occur  if  a  State  whose  relative  under- 
count exceeded  that  for  the  Nation  as  a  whole  was  estimated 
to  have  a  relative  undercount  equal  to  or  less  than  that  for 
the  Nation  as  a  whole.  In  this  case,  adjustment  would  in- 
crease the  underallocation  to  the  State. 


Given  the  quality  of  estimates  of  undercount  that  we 
are  likely  to  obtain,  we  can  expect  situations  (1)  and  (2) 
to  arise  in  any  adjustment  process  with  some  probability. 
Concern  over  the  equity  of  the  adjustment  process  might 
lead  us  to  try  to  reduce  the  probability  of  (1)  or  (2)  for  any 
given  party  to  below  a  specified  level.  Here  equity  consider- 
ations serve  as  constraints  on  the  adjustment  process.  In 
contrast,  equity  considerations  that  focused  just  on  the  out- 
comes of  the  allocations  after  adjustment  could  be  repre- 
sented as  loss  functions.  The  two  kinds  of  equity  may  not 
be  compatible,  and  tradeoffs  may  need  to  be  considered. 

Formally,  equity  concerns  about  the  adjustment  process 
can  also  be  represented  by  loss  functions,  and  the  several 
kinds  of  equity  we  have  been  discussing  can  be  jointly  repre- 
sented by  a  multivariate  loss  function.  However,  the  choice 
of  a  minimization  criterion  for  such  a  loss  function  is  equi- 
valent to  the  choice  of  a  tradeoff  among  the  alternative 
measures  of  inequity.  An  extensive  discussion  of  theory  for 
evaluating  tradeoffs  in  this  kind  of  situation  is  given  in  [6] . 

INTERPLAY  BETWEEN  EQUITY  AND 
ACCURACY 

Having  chosen  a  criterion  of  equity  and  specified  a  loss 
function  to  represent  our  notions  of  equity,  we  may  char- 
acterize the  most  equitable,  or  the  optimal,  set  of  population 
estimates  as  that  which  minimizes  the  expected  value  of  the 
loss function.⁵ The difference between the optimal
population estimate for an area and the census count is the optimal
adjustment  for  the  area.  The  values  of  the  optimal  adjust- 
ments are  affected  by  both  the  loss  function  used  (i.e.,  the 
chosen  criteria  of  equity  and  preferred  tradeoffs  among 
them)  and  the  probability  distribution  according  to  which 
the  loss  function's  expected  value  (or  expected  loss)  is 
computed. 

The  questions  we  will  consider  include  the  following: 

1.  How  sensitive  are  the  optimal  adjustments  to  different 
notions  of  equity  of  allocations? 

2. How sensitive are the optimal adjustments to the
accuracy of the estimates of undercount, or
equivalently, to the probability distribution with respect to
which expected loss is computed?

3. How do considerations of equity of the adjustment
process affect the choice of optimal adjustments? In
particular, how do the probabilities of causing an
underallocation or increasing the severity of an
underallocation vary with the accuracy of the estimates of
undercount?


5 As stated earlier, if the loss function is vector-valued, the
criterion of minimization may be complicated, and must be formulated to
reflect our preference over tradeoffs among different notions of
equity.

We have not tried to take uncertainty considerations into account
in formulating the loss function, so the loss function may not strictly
be interpreted as the negative of a Von Neumann-Morgenstern utility
function. Next to the other problems with developing appropriate
loss functions, this one is surely minor.

A  few  preliminary  words  about  limitations  of  the  sensi- 
tivity analysis  are  in  order.  These  limitations  pertain  to  the 
formulation  of  the  probability  distribution  with  respect  to 
which  the  expected  value  of  the  loss  function  is  computed. 
Determination of this probability distribution involves
assessment of the accuracy of the undercount estimates, or
equivalently, of the alternative population estimates. As discussed
in section 1, there are many sources of uncertainty in these
estimates  and  to  some  extent  formulation  of  the  probability 
distributions  will  necessarily  depend  on  the  subjective  be- 
liefs of  experts.  Further  consideration  of  the  formulation  of 
these  probability  distributions  is  beyond  the  scope  of  this 
paper. 

The  most  logical  approach  to  the  problem  of  finding 
the  optimal  adjustments  is  that  of  Bayesian  decision  theory 
[10], so that this probability distribution represents the
judgment  of  experts  about  the  probable  values  of  true 
population  sizes,  given  all  available  data.  For  the  present 
purposes  of  illustration,  the  Bayesian  approach  is  too 
complicated.  Instead,  we  will  consider  the  sensitivity  of  the 
optimal  adjustments  under  a  somewhat  ad  hoc  optimiza- 
tion approach  to  a  highly  simplified  model.  This  model 
will  be  easy  to  work  with  and  the  sensitivity  of  the  optimal 
adjustments  derived  from  it  will  provide  at  least  tentative 
insight  into  questions  1  through  3  above. 

Accuracy  and  Equity  of  Allocations 

We use the following model: Let $\theta_i = P_i/P_+$ be the true
fraction of total population belonging to party $i$, and
consider two estimates of $\theta_i$, that derived from the census
counts, $C_i$, and that derived from alternative methods (e.g.,
demographic analysis or synthetic methods), $E_i$. We assume
that $E_i$ is normally distributed with mean $\theta_i$ and variance
$V_i^2$. The errors in the census shares are modeled as fixed
biases with negligible variances (and negligible covariances
with $E_j$), so that the expectation of $C_i$ is $\theta_i - U_i$ and the
variance of $C_i$ is small enough to ignore. We interpret $U_i$
as the differential undercount. To optimally estimate $\theta_i$,
we consider choosing the weighted average of $C_i$ and $E_i$
that minimizes the expected value of the loss function.

First, we consider the weighted square-error loss. Writing
the weighted average as $Y_i = f_i C_i + (1 - f_i) E_i$, we want to
choose $f_i$ to minimize the expectation of $\sum_i w_i (Y_i - \theta_i)^2$.
If the parties are States, so that $\sum_i \theta_i$ is known to be 1, we
may wish to impose the constraint⁶ that $\sum_i Y_i = 1$. Under the
above assumptions, the desired values of $f_i$ are given by

$$f_i^* = \frac{V_i^2}{V_i^2 + U_i^2} - \lambda\, \frac{(C_i - E_i)/w_i}{V_i^2 + U_i^2} \qquad (5)$$

where the constant $\lambda$ is chosen⁷ so that $\sum_i Y_i = 1$.


Illustrative calculations are discussed below. To use the
optimal weights $f_i^*$, estimates of $U_i$ and $V_i$ are needed. To
estimate $V_i$, we would ideally use the difficult techniques
discussed much earlier; for the calculations presented below
a range of values from .05 $U_i$ to 5.0 $U_i$ was considered. To
estimate $U_i$ we will use the estimates of differential
undercount given by $U_i = E_i - C_i$. Thus, in practice (for this
example), the weights $f_i$ are random. For the calculations,
three sets of $E_i$ for States were used, corresponding to the
estimates in SPRR (table f-1, cols. 1 and 13, and table VII-D,
col. 7) denoted "SOR-3-1 WCF-1 BACF-1," "basic synthetic
(age, race, sex)," and "composite-2." Four sets of weights
were used: $w_i = 1/C_i$, $w_i = 1$, $w_i = C_i$, and $w_i = C_i^2$.
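The constrained weights can be computed directly once a sign convention for $\lambda$ is fixed. The sketch below assumes the form $f_i = V_i^2/(V_i^2 + U_i^2) - \lambda (C_i - E_i)/[w_i (V_i^2 + U_i^2)]$, with $U_i$ estimated by $E_i - C_i$; choosing the opposite sign in front of $\lambda$ would only reverse the sign of $\lambda$ itself. The input shares and standard deviations are hypothetical and do not come from SPRR.

```python
def constrained_weights(c, e, v, w):
    """Weights f_i, estimates Y_i, and multiplier lam such that sum(Y_i) = 1.

    c: census shares C_i         e: alternative estimates E_i
    v: standard deviations V_i   w: loss-function weights w_i
    Requires every V_i > 0 and at least one C_i != E_i.
    """
    u2 = [(ei - ci) ** 2 for ci, ei in zip(c, e)]  # U_i^2 estimated by (E_i - C_i)^2
    v2 = [vi ** 2 for vi in v]
    d = [a + b for a, b in zip(v2, u2)]            # V_i^2 + U_i^2

    # Sum of the unconstrained estimates (first term of the weight only).
    total = sum((v2i * ci + u2i * ei) / di
                for ci, ei, v2i, u2i, di in zip(c, e, v2, u2, d))
    denom = sum((ci - ei) ** 2 / (wi * di)
                for ci, ei, wi, di in zip(c, e, w, d))
    lam = (total - 1.0) / denom

    f = [v2i / di - lam * (ci - ei) / (wi * di)
         for ci, ei, wi, v2i, di in zip(c, e, w, v2, d)]
    y = [fi * ci + (1.0 - fi) * ei for fi, ci, ei in zip(f, c, e)]
    return f, y, lam


if __name__ == "__main__":
    census = [0.50, 0.30, 0.20]        # hypothetical census shares
    alternative = [0.48, 0.31, 0.22]   # hypothetical alternative estimates
    std_dev = [0.01, 0.01, 0.01]       # hypothetical V_i
    weights = [1.0, 1.0, 1.0]          # w_i = 1
    f, y, lam = constrained_weights(census, alternative, std_dev, weights)
    print(f, y, lam, sum(y))
```

Setting `lam` to zero in the expression for `f` recovers the unconstrained weights, which is the comparison made in the discussion of equation (5) that follows.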

The first point worth noting is that the values of $f_i^*$ were
determined by the first of the two terms on the right-hand
side of equation (5). The term involving $\lambda$ was always smaller
than $10^{-6}$, whereas $f_i^*$ ranged from 0.0025 to 0.96. Thus,
the optimal values of $f_i^*$ essentially are given by the first
term on the right of equation (5). Ignoring the second term,
we may reexpress (5) as

$$1 - f_i^* = \frac{1}{1 + V_i^2/U_i^2} \qquad (6)$$


As the relative variance of the estimate of differential
undercount, $V_i^2/U_i^2$, increases, the weight $1 - f_i^*$ decreases.
Some values of $1 - f_i^*$ and $V_i/U_i$ are shown below.

V_i/U_i   = 0.05   0.1   0.4   1.0   1.4   2.0   3.0   4.0   5.0

1 - f_i*  = 0.998  0.99  0.86  0.50  0.34  0.20  0.10  0.06  0.04

Thus, the believed accuracy of the undercount estimates has
a great effect upon the choice of optimal adjustments.⁸
Notice, however, that even when the available estimates of
undercount are highly inaccurate ($V_i/U_i$ at least 3.0), the
census figures are still adjusted somewhat.
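The tabulated weights follow from the relation $1 - f^* = 1/(1 + V^2/U^2)$, obtained by dropping the $\lambda$ term. A minimal sketch that recomputes them (the function name is an assumption):

```python
def adjustment_weight(ratio):
    """Weight 1 - f* given to the alternative estimate E,
    as a function of ratio = V/U."""
    return 1.0 / (1.0 + ratio ** 2)


RATIOS = [0.05, 0.1, 0.4, 1.0, 1.4, 2.0, 3.0, 4.0, 5.0]

if __name__ == "__main__":
    for r in RATIOS:
        print(f"V/U = {r:<4}  1 - f* = {adjustment_weight(r):.3f}")
```

The weight equals one half when V = U and falls off with the square of the ratio, which is why some adjustment remains even at V/U = 5.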


where the constant λ is chosen7 so that Σ Y_i = 1.


6 The Bayesian approach would not require this constraint to be
separately imposed.

7 Choose

    \lambda = \frac{-1 + \sum_i (V_i^2 C_i + U_i^2 E_i) / (V_i^2 + U_i^2)}
                   {\sum_i (C_i - E_i)^2 / [w_i (V_i^2 + U_i^2)]}

Note that the weights f_i* as given by (5) are really random, de-
pending on C_i and E_i. One alternative way to constrain Σ Y_i = 1 is to
choose f_i*, disregarding the constraint (in this case f_i* is given by the
first term on the right-hand side of (5)), and then to divide each Y_i by
Σ Y_i. This is intuitively less appealing than (5). For example, if E_i
and C_i are equal, then using f_i* from (5) we have Y_i = C_i = E_i, but if
we simply divide each Y_i by Σ Y_i, then Y_i ≠ C_i = E_i.

8 This conclusion, of course, depends on the models used above and
on the optimization approach adopted. A relationship between the
Bayesian approach and the present approach is easily seen in the uni-
variate case. Under the approach above, the optimal weight was
f* = V^2/(V^2 + U^2). To consider a Bayesian approach, let E (condi-
tionally on C) be normally distributed with mean θ and variance
V^2, let C be normally distributed with mean θ and variance U^2,
and let θ be distributed with large variance compared to U^2 and
V^2. Then the posterior mean for θ is f*C + (1 - f*)E, where
f* = V^2/(V^2 + U^2), the same as the optimal Y derived above.
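
The equivalence noted in footnote 8 can be illustrated with a short sketch. It assumes the limiting case of a flat prior on θ and treats C and E as independent given θ; both are simplifying assumptions of this illustration. The function names and numerical inputs are hypothetical.

```python
def precision_weighted_mean(c, e, u2, v2):
    """Posterior mean of theta under a flat prior (an assumption of this
    sketch) when C ~ N(theta, u2) and E ~ N(theta, v2) independently."""
    return (c / u2 + e / v2) / (1.0 / u2 + 1.0 / v2)


def weighted_average_rule(c, e, u2, v2):
    """Y = f*C + (1 - f*)E with f* = V^2 / (V^2 + U^2)."""
    f_star = v2 / (v2 + u2)
    return f_star * c + (1.0 - f_star) * e


if __name__ == "__main__":
    # Hypothetical values: census share, alternative estimate, U^2, V^2.
    c, e, u2, v2 = 0.020, 0.022, 4e-6, 1e-6
    print(precision_weighted_mean(c, e, u2, v2))
    print(weighted_average_rule(c, e, u2, v2))
```

Both functions return the same value, since multiplying the numerator and denominator of the precision-weighted mean by U^2 V^2 gives (V^2 C + U^2 E)/(V^2 + U^2).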
The fact that the second term on the right-hand side of
equation (5) was negligible implies that the weights w_i
in the loss function are unimportant for determining the op-
timal adjustments. This has implications for the equitable
treatment of subgroups of the population, as previously
discussed. If subgroups of the population living in different
States are treated as different subgroups, then concern about
equity of allocations to these subgroups can be reflected by
modifying the weights w_i for the States in which they re-
side. (See the parenthetical note at the end of the Population
Subgroups section.) However, the insensitivity of the optimal
weights f_i* to variation in the weights w_i indicates that equity in
allocations to these subgroups can best be achieved by maxi-
mizing the equity in the allocations to the States, that is, by
choosing adjustments to minimize the loss function for errors
in allocations to States.

If  subgroups  of  the  population  living  in  different  States 
are  not  treated  separately  but  instead  are  treated  as  aggre- 
gates, the  situation  is  more  complex,  but  I  conjecture  that  a 
similar  result  holds.  That  is,  the  optimal  adjustments  derived 
by  considering  equity  in  allocations  to  States  and  to  sub- 
groups will  be  identical  to  those  derived  by  considering 
equity  in  allocations  to  States  alone.  Of  course,  this  con- 
clusion  needs   to   be   tested    under   other    models   as   well. 

Next, we consider the sensitivity of the optimal allocations
to changes in the form of the loss function. An alternative
to the weighted squared-error loss function is the weighted
absolute-error loss function Σ w_i |Y_i - θ_i|. The optimal
weights f_i* under this loss function do not have tidy ex-
pressions9 like those for squared-error loss (5), but they
too depend only on V_i^2/U_i^2 (if the constraint that Σ Y_i = 1 is
not imposed). Some values are shown below.


V/U                =  2.0   1.4   1.0   0.7   0.4   0.3   0.25  0.15  0.1
f* squared error   =  .80   .67   .50   .34   .15   .09   .06   .02   .01
f* absolute error  =  .70   .59   .44   .29   .14   .08   .05   .02   .01
difference         =  .00   .00   .00   .00   .00   .00   .00   .00   .00
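
The two rows of weights can be approximated with a short sketch. It assumes the unconstrained squared-error weight f* = (V/U)^2/(1 + (V/U)^2) and solves the equation given in footnote 9, R(Φ(Q*R) - 0.5) = φ(Q*R) with R = U/V and f* = Q*/(1 + Q*), by bisection. The function names and the choice of bisection are illustrative only. The results agree with the tabulated values approximately, not necessarily to the last printed digit.

```python
import math


def normal_pdf(x):
    """Standard normal probability density function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def f_squared_error(v_over_u):
    """Unconstrained optimal weight on the census figure, squared-error loss."""
    r2 = v_over_u ** 2
    return r2 / (1.0 + r2)


def f_absolute_error(v_over_u, iterations=100):
    """Solve R(Phi(QR) - 0.5) = phi(QR) for Q, then return Q / (1 + Q)."""
    r = 1.0 / abs(v_over_u)  # R = U/V

    def g(q):
        return r * (normal_cdf(q * r) - 0.5) - normal_pdf(q * r)

    lo, hi = 0.0, 1.0
    while g(hi) <= 0.0:      # g is increasing in q; expand until the root is bracketed
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    q = 0.5 * (lo + hi)
    return q / (1.0 + q)


if __name__ == "__main__":
    for ratio in [2.0, 1.4, 1.0, 0.7, 0.4, 0.3, 0.25, 0.15, 0.1]:
        print(f"V/U = {ratio:4.2f}  squared: {f_squared_error(ratio):.2f}"
              f"  absolute: {f_absolute_error(ratio):.2f}")
```

Bisection is used because the left side minus the right side of the footnote 9 equation is increasing in Q for Q ≥ 0, so the root is unique.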


For all the values of V/U considered (note we have dropped
the subscript i), the optimal adjustments under the two loss
functions are close. The optimal adjustments under absolute-
error loss uniformly give more weight to the alternative esti-
mate, but for values of V/U less than 1, the difference is slight.
To sum up, the choice of weights for the loss function
and the form of the loss function have little effect on the
magnitude of the optimal adjustment. This suggests that a
convention for equity in allocations can be agreed upon.
On the other hand, the optimal adjustments are sensitive
to the believed accuracy of the estimates of undercount.
Obtaining estimates of accuracy is difficult and may give
rise to controversy among statisticians.10


9 The optimal f* for this loss function are the solutions to

    R (\Phi(Q^* R) - 0.5) = \phi(Q^* R)

where R = U/V, f* = Q*/(1 + Q*), and Φ and φ are the normal cumu-
lative distribution function and probability density function, respec-
tively (and we do not require Σ Y_i = 1).


Accuracy and Equity of the Adjustment Process

Under the models considered so far, we do not fully ad-
just for estimated undercount but adjust by a fraction 1 - f*.
It is important to realize that this does not necessarily under-
adjust for areas whose undercount is more severe than the
national average. That is, under the optimal adjustments as
derived, there is a probability of overadjusting.

To see this, we consider the optimal adjustments (6).
Recall that the optimal weights f* are estimated on the
basis of estimates of U and V. Suppose V is known but U is
estimated by Û, the estimated differential undercount
(Û_i = E_i - C_i). Thus, the optimal adjustment for a State may
be expressed as

    \frac{\hat{U}^3}{\hat{U}^2 + V^2}          (7)

Suppose U is positive, so that the State was undercounted
more severely than the Nation as a whole. Then the optimal
adjustment (7) overadjusts if Û^3/(Û^2 + V^2) exceeds U. Under the
distribution we have been considering for E, Û is normally
distributed with mean U and variance V^2. The probability
of overadjusting may thus be represented as

    Prob [ Z (Z + U/V)^2 - U/V > 0 ]          (8)

where Z = (Û - U)/V has a standard normal distribution.
These probabilities are shown below for various values of
V/U. (The probability of overadjusting does not depend on
the sign of U.)


V/U                           =  .05   .10   .20   .40   .60   1.0   2.0   3.0   5.0
Probability of overadjusting  =  .48   .46   .43   .38   .35   .32   .31   .31   .32


For large values of V/U (1 or greater), the probability of
overadjusting is fairly constant. For values of V/U less than
1, the optimal rule is "conservative" in the sense that the
probability of overadjusting decreases as the accuracy of the
undercount estimate decreases. For all the values of V/U
considered, the probability exceeds 0.3.


10 Thus (see footnote 8, above), in the Bayesian model corre-
sponding to our example the distribution of C was taken to be normal
with mean θ and variance U^2, but the compelling reasons for this
particular distributional assumption are hard to find.
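
The probability in expression (8) can be evaluated directly. The sketch below is one way to do so and is not taken from the original calculations. Writing a = |U|/V, the function h(z) = z(z + a)^2 - a is negative for z ≤ 0 and increasing for z > 0, so the event in (8) is simply Z > z0, where z0 is the single positive root of h. The function names and the use of bisection are choices made for this illustration.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_overadjust(v_over_u, iterations=100):
    """Evaluate Prob[Z (Z + U/V)^2 - U/V > 0] for standard normal Z."""
    a = 1.0 / abs(v_over_u)  # a = |U|/V; the sign of U does not matter

    def h(z):
        return z * (z + a) ** 2 - a

    lo, hi = 0.0, 1.0        # h(0) = -a < 0 and h(1) = 1 + a + a^2 > 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if h(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    z0 = 0.5 * (lo + hi)
    return 1.0 - normal_cdf(z0)


if __name__ == "__main__":
    for ratio in [0.05, 0.10, 0.20, 0.40, 0.60, 1.0, 2.0, 3.0, 5.0]:
        print(f"V/U = {ratio:4.2f}   P(overadjust) = {prob_overadjust(ratio):.2f}")
```

For V/U = 2 the root is exactly z0 = 0.5, which gives a probability of about 0.31, in line with the table.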

This  has  possible  implications  if  one  of  the  criteria  of 
equity  is  that  adjustment  should  not  cause  a  party  that 
would not be underallocated funds under the unadjusted
census data to be underallocated funds under the adjusted
figures, or more briefly put, adjustment should not cause
harm. This notion of equity might be interpreted to mean
that for any State, the probability that adjustment causes
harm should be less than some specified threshold value.
If the threshold is less than 0.3, it might be desirable to
shrink the optimal weights 1 - f* even further for States with
negative  U.  Unfortunately,  since  the  sum  over  States  of 
f*U  must  equal  zero,  this  would  imply  that  the  weights 
1  -  f*  for  States  with  positive  U  must  also  be  shrunk.  This 
would  increase  the  inequity  in  the  allocations. 

Next, we consider the probability that adjustment will be
in the wrong direction entirely. If U is negative, then adjust-
ment in the wrong direction increases the underallocation.
For the weighted-average rules we have been discussing, this
probability is approximately the probability that Û is greater
(less) than 0, given that U is less (greater) than 0. This prob-
ability depends on the absolute value of the ratio V/U and
is shown below.


V/U                                          =  .20   .40   .60   .80   1.0   2.0   3.0   5.0
Probability of adjusting in wrong direction  =  .00   .01   .05   .11   .16   .31   .37   .42
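
These probabilities follow from the normal model used above. As a minimal sketch, assuming Û is normally distributed with mean U and variance V^2, the probability that Û has the opposite sign from U is Φ(-|U|/V). The function names below are illustrative.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_wrong_direction(v_over_u):
    """P(estimate has the wrong sign) = Phi(-|U|/V) under the normal model."""
    return normal_cdf(-1.0 / abs(v_over_u))


if __name__ == "__main__":
    for ratio in [0.20, 0.40, 0.60, 0.80, 1.0, 2.0, 3.0, 5.0]:
        print(f"V/U = {ratio:4.2f}   P(wrong direction) = "
              f"{prob_wrong_direction(ratio):.2f}")
```

Because the probability depends only on |U|/V, it is unaffected by how strongly the weighted-average rule shrinks the adjustment.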


It  is  plausible  that  the  absolute  value  of  V/U  is  larger  for 
areas  with  U  close  to  zero,  and  thus  adjustment  by  the 
weighted  average  rule  will  increase  the  harm  to  areas  with 
small  differential  undercount  (U  slightly  less  than  zero) 
with  higher  probability  than  for  areas  with  large  differential 
undercount  (U  much  less  than  zero).  However,  since  the 
probabilities  of  adjusting  in  the  wrong  direction  do  not 
depend  on  f* ,  they  cannot  be  altered  by  shrinking  f*.  If  one 
of  the  equity  considerations  is  that  adjustment  will  not  in- 
crease the  severity  of  underallocation  to  any  party,  then  a 
possible  approach  would  be  as  follows:  For  any  State  for 
which  the  probability  of  adjusting  in  the  wrong  direction  ex- 
ceeds a  threshold  value,  the  adjustment  will  be  scaled  down 
to  force  the  expected  increase  in  underallocation  below  a 
specified  level. 

This  would  have  the  same  kind  of  effect  noted  above 
for  the  equity  notion  that  adjustment  should  not  cause 
harm.  These  considerations  of  equity  of  the  adjustment 
process imply that fewer funds should be reallocated under
adjustment  than  would  be  optimal  under  the  considerations 
of  equity  in  allocations  alone.  Further  work  is  needed  to 
explore  the  quantitative  implications  of  these  opposing 
equity  considerations.  Political  discussion  is  also  needed  to 
determine  whether  there  exist  substantial  concerns  over 
equity  in  the  adjustment  process,  as  described. 


OTHER  CONSIDERATIONS 

Thus  far,  we  have  been  addressing  the  issue  of  equity  in 
a  narrow  sense,  disregarding  statistical  design  questions  such 
as  how  much  data  accuracy  is  needed,  how  should  resources 
be  allocated  to  improve  the  data,  how  much  money  should 
be  spent  on  collecting  and  analyzing  the  data,  and  should 
adjustment  be  performed? 

Loss  functions  and  decision  theory  are  useful  for  design- 
ing the  entire  census  effort,  and  considering  the  census  as  a 
whole  lends  perspective  for  considering  undercount  adjust- 
ment. Decision  theory  tells  us  to  consider  the  costs  and  bene- 
fits of  alternative  data  programs  and  choose  the  data  program 
that  maximizes  the  difference  between  expected  benefit 
and  cost  [15]  . 

Costs  include: 

•  data  collection  and  analysis 

•  political  cost  if  the  census  is  not  adjusted  for  under- 
count 

•  a  smaller  political  cost  if  the  census  is  adjusted 

•  additional  administrative  costs  if  two  sets  of  books  are 
needed 

•  other  costs 

Benefits  include: 

•  more  or  less  quantifiable  benefits  from  more  accurate 
allocations  of  funds  and  congressional  apportionment 

•  other  benefits  for  the  public  and  private  sectors,  less 
studied  and  at  this  point  less  quantifiable 

More  study  is  needed  to  assess  the  magnitudes  of  the  benefits 
that  can  arise  from  better  data  and  then  to  combine  the 
different  measures  of  benefit  into  one  loss  function.  In  par- 
ticular, in  considering  loss  functions  to  represent  inequity  in 
fund  allocations,  we  have  been  using  the  simple  assumption 
that  errors  in  allocations  are  roughly  proportional  to  differ- 
ential errors  in  population  figures.  There  are  many  allocation 
programs  and  they  utilize  population  data  in  different  ways. 
How should a single loss function (possibly a weighted sum
of separate loss functions, as in [8]) be devised to at least
approximately reflect these different uses of the data? How
sensitive will optimal adjustments be to these kinds of
variations in the loss function?

In  this  larger  benefit-cost  framework,  we  may  still  use  loss 
functions  to  measure  equity,  but  now  we  must  assign  a  dollar 
value to inequity from errors in allocation of funds. For ex-
ample, if inequity is measured by a constant times the sum
of absolute errors in allocations to States, how big is the
constant? To reduce errors in allocation by $100 million, is
it worth spending $10 million, $1 million, or $100,000? In
general,  the  optimal  decisions  for  the  benefit-cost  problem 
will be far more sensitive to the form of the loss function
than were the optimal adjustments for undercount con-
sidered above.

Because  of  difficulties  in  quantifying  benefits  and  costs, 
decision  theory  is  not  a  sufficient  basis  for  planning  a  census, 
but  the  analysis  does  sharpen  understanding  of  the  merits  of 
alternative  programs,  including  programs  that  adjust  for 
undercount  versus  programs  that  do  not  adjust,  and  post- 
enumeration  survey  operations  and  demographic  analysis 
operations  of  different  magnitudes. 

SUMMARY  AND  CONCLUSIONS 

Two  kinds  of  equity  have  been  considered,  equity  in 
allocations  and  equity  in  the  adjustment  process.  Concepts 
of  equity  in  allocations  can  be  represented  by  decision- 
theoretic  loss  functions.  Care  is  needed  to  ensure  that  the 
loss  functions  represent  our  concepts  of  equity  and  also 
make  sense  statistically.  Examples  were  discussed.  Optimal 
adjustments  that  minimize  the  expected  value  of  the  loss 
function  can  be  found.  Under  the  simple  models  considered, 
the  optimal  adjustments  were  insensitive  to  the  form  of  the 
loss  function.  If  this  finding  extends  to  more  complicated 
models,  it  should  diminish  debate  about  how  to  construct 
the  loss  function.  In  particular,  to  maximize  equity  in  the 
allocations  to  ethnic  and  other  subgroups  of  the  population, 
we  need  not  consider  these  subgroups  explicitly  in  devising 
a  loss  function,  but  should  focus  instead  on  governmental 
units  to  which  allocations  are  made  directly. 

Equity  in  the  adjustment  process  concerns  the  likelihood 
that  the  underallocations  to  some  areas  will  be  increased 
by  adjustment  (because  of  errors  in  the  estimates  of  under- 
count) and  that  some  areas,  which  would  not  be  underallo- 
cated  under  the  original  census  data,  will  be  underallocated 
under  the  adjusted  data.  These  equity  considerations  imply 
optimal  adjustments  different  than  the  optimal  adjustments 
for  equity  in  allocations.  Tradeoffs  among  the  different 
criteria  of  equity  need  to  be  considered.  More  work  is 
needed  to  examine  the  sensitivity  of  the  optimal  adjustments 
to  different  tradeoffs. 

Estimation  of  the  accuracy  of  the  undercount  estimates 
may  be  more  important  in  affecting  the  adjustments  than 
the  concepts  of  equity  applied.  Such  estimation  will  be  diffi- 
cult and  needs  attention. 

In  conclusion,  decision-theoretic  methods  can  be  used 
to  adjust  the  census  for  undercount.  More  work  is  needed 
for  formulating  loss  functions,  but  the  optimal  adjustments 
do  not  seem  to  be  sensitive  to  the  loss  function  adopted. 
More  work  is  also  needed  to  assess  the  accuracy  of  the  under- 
count estimates,  but  this  work  is  necessary  for  any  statis- 
tically solid  approach  to  be  used.  The  main  disadvantage  of  a 
decision-theoretic  approach  is  that  it  may  be  difficult  to  ex- 
plain to  the  public  the  rationale  for  the  adjustments.  I  do 
not  find  this  disadvantage  overwhelming,  noting  that  season- 
ally adjusted  price  indexes  and  labor  force  estimates  are 
widely used, and that the postcensal estimates of population
and per capita income used for general revenue sharing
purposes are incredibly complicated.

There  are  numerous  advantages  to  the  decision-theoretic 
approach. 

•  Bayesian  decision  theory  forms  a  coherent  framework 
for  combining  different  kinds  of  information,  e.g., 
demographic  analysis  and  dual  systems  estimates. 

•  The  flexibility  of  the  approach  allows  us  to  consider 
different  kinds  of  effects  of  errors  in  data;  e.g.,  errors 
in  allocations  to  subgroups  who  do  not  receive  alloca- 
tions directly.  It  also  allows  for  appropriate  uses  of 
estimates  of  undercount  that  are  accurate  for  some 
areas  and  inaccurate  for  other  areas. 

•  Statistical  work  is  somewhat  protected  from  political 
debate. 

•  The  decision-theoretic  approach  gives  the  most  accur- 
ate final  population  figures  (where  the  loss  function 
serves  as  the  measure  of  accuracy). 

REFERENCES 

1.  Bishop,  Y.  M.  M.,  Fienberg,  S.  E.,  and  Holland,  P.  W. 
Discrete  Multivariate  Analysis.  Cambridge,  Mass.:  MIT 
Press,  1975. 

2.  Black,  H.  C.  Black's  Law  Dictionary,  4th  Edition.  St. 
Paul,  Minn:  West  Publishing  Co.,  1968. 

3.  Ferreira,  J.  "Identifying  Equitable  Insurance  Premiums 
for  Risk  Classes:  An  Alternative  to  the  Classical  Ap- 
proach." Division  of  Insurance,  Commonwealth  of 
Massachusetts,  Automobile  Insurance  Risk  Classifica- 
tion: Equity  and  Accuracy,  74-120.  Boston:  Massa- 
chusetts Division  of  Insurance,  1978. 

4.  Hill,  R.  B.,  and  Steffes,  R.  B.  "Estimating  the  1970 
Census  Undercount  for  State  and  Local  Areas."  National 
Urban  League  Data  Service,  Washington,  D.C.  1973. 

5. Jabine, T. B. "Equity in the Allocation of Funds Based
on Simple Data." U.S. Bureau of the Census, Small-Area
Statistics Papers, Series GE-41, No. 3. "Conference
on Small-Area Statistics, Boston, 1976." Washington,
D.C.: U.S. Government Printing Office, 1977, 2-8.

6.  Keeney,  R.  L.,  and  Raiffa,  H.  Decisions  with  Multiple 
Objectives:  Preferences  and  Value  Tradeoffs.  New  York: 
John  Wiley  and  Sons,  1976. 

7.  Keyfitz,  N.  "Information  and  Allocation:  Two  Uses  of 
the  1980  Census."  The  American  Statistician  33,2 
(1979),  45-50. 

8.  Kish,  L.  "Optima  and  Proxima  in  Linear  Sample  De- 
sign." Journal  of  the  Royal  Stat.  Society,  A  139(1976), 
80-95. 

9.  Marks,  E.  S.,  Seltzer,  W.,  and  Krotki,  K.  J.  Population 
Growth  Estimation.  New  York:  The  Population  Council, 
1974. 

10.  Raiffa,  H.,  and  Schlaifer,  R.  Applied  Statistical  Decision 
Theory.  Cambridge:  Harvard  University  Press,  1961. 

11. Robinson, J. G., and Siegel, J. S. "Illustrative Assess-
ment of Census Underenumeration and Income Under-
reporting on Revenue Sharing Allocations at the Local
Level." To appear in Proceedings of the ASA, Social Sta-
tistics Section, 1979.

12.  Savage,  I.  R.,  and  Windham,  B.  "Effects  of  Bias  Removal 
in  Official  Use  of  United  States  Census  Counts."  The 
Florida  State  University,  Department  of  Statistics, 
Tallahassee,  Fla.,  1973. 

13.  U.S.  Department  of  Commerce,  Bureau  of  the  Census. 
"Coverage  of  Population  in  the  1970  Census  and  Some 
Implications  for  Public  Programs,"  by  J.S.  Siegel.  Cur- 
rent Population  Reports,  Series  P-23,  No.  56.  Washing- 
ton, D.C.:  U.S.  Government  Printing  Office,  1975. 

14. U.S. Department of Commerce, Bureau of the Census.
"Developmental Estimates of the Coverage of the
Population in the 1970 Census: Demographic Analysis,"
by J.S. Siegel, et al. Current Population Reports, Series
P-23, No. 65. Washington, D.C.: U.S. Government
Printing Office, 1977.

15. Spencer, B. "Benefit-Cost Analysis of Data Used To
Allocate Funds: General Revenue Sharing." Unpub-
lished doctoral dissertation, Yale University Statistics
Dept., 1979.

16.  Stanford  Research  Institute.  General  Revenue  Sharing 
Data  Study,  4  vols.  Menlo  Park,  Calif.:  Stanford  Re- 
search Institute,  1974. 

17. Strauss, R. P., and Harkins, P. B. "The Impact of Popula-
tion Undercounts on General Revenue Sharing Alloca-
tions in New Jersey and Virginia." National Tax Journal
XXVII (1974), 617-624.


Comments 


Harry  V.  Roberts 

University  of  Chicago 


INTRODUCTION 

The  papers  by  Drs.  Fellegi  and  Spencer  provide  thoughtful 
and  stimulating  examinations  of  ways  in  which  considera- 
tions of  equity  bear  on  census  reporting  of  undercount. 
Fellegi  concentrates  on  the  decision  of  whether  or  not  to 
adjust  the  census  "headcount"  in  the  light  of  available 
information  about  undercount.  (The  quotes  around  "head- 
count"  are  to  remind  the  reader  that  there  are  certain 
departures  from  a  literal  headcount  in  obtaining  the  number 
so  described.)  His  approach  is  based  on  a  particular  loss 
function  that  provides  mathematical  expression  of  equity. 
Spencer  discusses  a  variety  of  loss  functions,  stresses  the 
practical  implications  of  these  loss  functions  in  terms  of 
various  ideas  about  equity,  and  outlines  the  interplay 
between  loss  functions  and  probabilistic  expressions  of  data 
inaccuracy  when  one  minimizes  expected  loss,  as  required  by 
Bayesian  decision  theory.  Both  provide  discussions  of  data 
inaccuracy.  Fellegi  describes  Canadian  studies  that  illustrate 
the  types  of  statistical  information  that  can  be  used  in 
estimating  undercount.  He  also  offers  suggestions  on  the 
optimal  design  of  such  studies.  Explicit  Bayesian  ideas  are 
more  conspicuous  in  Spencer's  paper,  but  both  authors  are 
mindful  of  the  need  for  certain  judgments  about  data 
inaccuracy if a reasoned approach to adjustment is to be
achieved.

Both  papers  have  contributed  substantially  to  my  under- 
standing of  the  issues  of  undercount.  Both  reach  general 
conclusions  that  appear  reasonable.  Fellegi,  citing  a 
precedent  but  making  no  special  argument,  specifies  a 
symmetrical  squared-error  estimation  loss  function  as  a 
quantitative  expression  of  the  concept  of  equity.  He 
addresses  inequity  alleviated  or  caused  by  attempts  to  adjust 
headcounts  in  the  light  of  evidence  bearing  on  undercount. 
The  latter  is  expressed  in  terms  of  a  point  estimator,  possibly 
subject  to  uncertain  sampling  bias,  and  an  accompanying 
standard  error.  He  formulates  the  decision  to  adjust  or  not  to 
adjust  as  a  hypothesis-testing  problem.  Under  certain 
reasonable  assumptions  about  the  sampling  properties  of  the 
point  estimator,  and  if  sample  sizes  are  not  too  small,  his 
procedure  is  likely  to  point  in  the  direction  of  adjustment. 

Spencer,  like  Fellegi,  considers  the  equitable  aspects  of 
the  decision  of  whether  or  not  to  adjust.  His  major  attention, 
however,  is  focused  on  how  adjustments  should  be  made, 
that  is,  on  how  to  estimate  the  undercount  by  reliance  on  a 
loss  function  that  expresses  equity.  He  concludes  that  the 
results  are  relatively  insensitive  to  the  precise  specification  of 
loss  functions. 


In  my  discussion,  I  shall  argue  that  from  the  professional 
perspective  of  statistics,  I  see  virtually  no  controversial 
questions  of  equity  but  enormous  technical  problems  of 
estimation.  In  developing  my  reasoning,  I  find  the  distinction 
between  statistical  and  political  equity  suggested  by  Robert 
Hill  to  be  most  useful.  In  the  undercount  context,  political 
equity  is  expressed  in  allocation  formulas  of  Congress,  while 
statistical  equity  concerns  the  appropriate  inputs  to  these 
formulas  when  the  actual  inputs  presumably  intended  by 
Congress  (for  example,  true  population)  are  not  available. 
Hence  we  have  the  focus  of  this  conference  on  undercount, 
the  discrepancy  between  what  Congress  presumably  had  in 
mind  and  the  readily  available  headcount. 

Statistical  equity  is  no  different  in  estimation  of  under- 
count than  in  estimation  of  any  other  quantity  of  interest  to 
public  decision  makers,  such  as  the  percentage  unemploy- 
ment. The  question  is,  "Given  certain  evidence,  how  do  we 
make  the  point  estimate?"  Whether  or  not  statisticians  have 
viewed  these  problems  in  the  decision-theoretic  framework, 
they  have  implicitly  proceeded  as  if  a  symmetric  loss 
function  were  appropriate  in  estimation.  For  optimal  design 
of  such  studies,  one  would  in  principle  want  to  go  further 
than  specification  of  symmetry  and  one  would  have  to  assess 
numerical  parameters  of  the  loss  functions.  For  estimation, 
given  available  evidence,  symmetry  is  enough.  Thus,  although 
I  argue  for  an  absolute-error  loss  function  instead  of  the 
squared-error  loss  function  preferred  by  Fellegi,  this  is  a 
detail. 

Estimation  of  undercount  can  be  compared  with  esti- 
mation of  various  components  of  nonsampling  error  in 
Government  surveys.  Just  as  statisticians  have  not  estimated 
undercount  except  in  postcensus  research,  they  do  not 
usually  estimate  nonsampling  errors  in  other  surveys, 
although  they  may  do  research  on  these  errors  and  present 
general  conclusions  about  their  nature  and  possible  extent. 
The  reason  for  such  practices  is  to  be  found  not  in  questions 
of  equity  but  in  the  nature  of  the  evidence  and  the  available 
statistical  tools.  There  is  a  strong  inclination  by  statisticians 
to  estimate  only  quantities  for  which  there  is  general 
professional  consensus  as  to  how  the  estimates  should  be 
made.  If  different  statisticians  confronted  with  the  same 
evidence would come up with widely varying estimates, there
would be reluctance to quote any estimate at all. This
attitude indeed distinguishes statisticians from other
professionals, such as intelligence analysts or economic
forecasters, who are expected to make estimates even when it
is understood that no professional consensus may exist.

I am indebted to Bruce Spencer and Arnold Zellner
for helpful comments on drafts of this discussion.

The  conference  has  heard  considerable  discussion  about 
the  desirability  of  simplicity  of  estimation  procedures.  To 
the  extent  that  simplicity  reflects  the  desire  for  parsimonious 
statistical  models  whenever  possible,  simplicity  is  a 
desideratum.  But  the  main  desideratum  is  professional 
consensus.  The  estimates  of  unemployment  and  the  seasonal 
adjustments  thereof  are  far  from  simple,  but  statisticians 
agree  that  they— or  closely  similar  estimates— are  technically 
defensible. 

PUBLICATION  OF  TWO  SETS  OF  BOOKS 

Before  beginning  my  central  argument,  I  pause  to  mention 
a  practical  suggestion  stimulated  by  the  papers  of  Fellegi  and 
Spencer.  The  idea  is  implicit  also  in  other  writings  on 
undercount,  and  Dan  Melnick  has  reminded  me  that  it  was 
actually  adopted  in  the  wording  of  the  1977  proposed  census 
bill,  H.R.  8871. 

I  applaud  the  focus  of  both  authors  on  undercount  and 
statistical  estimates  of  undercount,  as  opposed  to  adjusted 
estimates  of  population,  because  this  focus  suggests  a  simple 
resolution  to  one  aspect  of  the  controversies  about  census 
adjustment.  It  has  been  said  that  if  census  counts  are 
adjusted,  there  would  be  a  need  to  publish  two  sets  of  census 
numbers,  the  unadjusted  headcounts  and  the  adjusted 
population  estimates,  and  this  practice  would  be  confusing. 
Confusion  can  be  almost  entirely  avoided  by  the  simple 
expedient  of  publishing  headcounts  and  estimates  of  under- 
count; the  latter  down  to  whatever  levels  of  disaggregation 
may  be  determined.  Then  a  user  desiring  an  adjusted  estimate 
would  have  to  perform  the  arithmetic  of  combining  the 
headcounts  and  the  undercount  estimates,  and  hence  would 
be  forced  to  see  the  precise  nature  and  extent  of  the 
adjustment,  something  easily  forgotten  when  adjusted 
numbers  are  conveniently  available.  This  publication  policy 
would  also  palliate  the  problem  posed  by  later  revisions, 
since  the  headcounts  themselves  would  remain  unchanged 
even  if  the  undercount  estimates  were  later  to  be  revised,  say 
in  the  light  of  demographic  analyses  that  would  not  become 
available  as  early  as  other  indicators  of  undercount.  Con- 
fusion and  careless  interpretation  would  be  reduced. 

I  regard  the  term  "adjustment,"  which  is  by  now 
hopelessly  entrenched,  as  unfortunate  from  a  semantic  point 
of  view.  "Estimate  the  undercount"  correctly  suggests  the 
nature  of  the  problem.  "Adjust  the  headcount"  conjures  up 
the  image  of  cooking  the  data. 

Undercount  can  be  defined  and  estimated  either  in 
absolute  or  relative  terms.  If  relative  undercount  is  to  be 
reported,  reporting  considerations  suggest  a  change  of  the 
customary definition of relative undercount, in which the
denominator is the unknown true population value. My
proposal  is  that  relative  undercount  be  defined  with  the 
known  headcount  as  denominator.  Thus  an  undercount  of, 
say,  2  percent,  would  imply  a  needed  upward  revision  of  the 
headcount  by  2  percent,  rather  than  a  division  of  the 
headcount  by  1-0.02  =  0.98.  In  most  instances,  the 
difference  would  of  course  be  small,  but  if  users  are  to  do 
arithmetic,  the  arithmetic  should  be  as  simple  as  possible.  An 
alternative  would  be  to  report  absolute  undercount,  but  the 
use  of  percentages  of  estimated  undercount,  defined  as 
suggested,  gives  a  better  picture  of  the  adjustment. 
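
The arithmetic behind the two definitions can be sketched as follows. The headcount of 1,000,000 and the function names are hypothetical and serve only to show how the same reported percentage leads to slightly different adjusted figures, and how a rate on one basis converts to the other.

```python
def adjust_headcount_based(headcount, rel_undercount):
    """Proposed definition: undercount is a fraction of the known headcount,
    so the user multiplies by (1 + rate)."""
    return headcount * (1.0 + rel_undercount)


def adjust_truth_based(headcount, rel_undercount):
    """Customary definition: undercount is a fraction of the true population,
    so the user divides by (1 - rate)."""
    return headcount / (1.0 - rel_undercount)


def truth_based_to_headcount_based(rel_undercount):
    """Convert a rate stated against the true population to the equivalent
    rate stated against the headcount."""
    return rel_undercount / (1.0 - rel_undercount)


if __name__ == "__main__":
    headcount = 1_000_000  # hypothetical headcount
    print(adjust_headcount_based(headcount, 0.02))
    print(adjust_truth_based(headcount, 0.02))
    print(truth_based_to_headcount_based(0.02))
```

For a 2 percent rate the two adjusted figures differ by roughly 400 persons per million counted, which illustrates why the difference is usually small.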

PUBLIC  POSTERIOR  DISTRIBUTIONS 

Bayesian  language  is  convenient  for  communication 
because  it  has  names  for  all  relevant  concepts.  For  this 
reason,  I  shall  freely  use  Bayesian  as  well  as  non-Bayesian 
terminology  in  developing  my  argument.  In  so  doing,  I  offer 
the  assurance  to  non-Bayesians  that  I  shall  not  advocate 
intrusion  of  private,  "nondiffuse"  judgments  into  the 
statistical  analysis  (which  would  be  objectionable  from  our 
present  perspective)  and  that  an  approximate  non-Bayesian 
translation  of  my  reasoning  is  possible. 

Estimation  of  undercount  (adjustment  of  headcount)  is  a 
problem  in  Bayesian  point  estimation.  The  optimal  point 
estimate  minimizes  expected  estimation  loss  with  respect  to 
the  posterior  distribution  of  the  quantity  to  be  estimated. 
The  practical  problem  is  whether  or  not  the  posterior 
distribution  of  undercount  can  be  regarded  as  "public"; 
that  is,  would  statisticians  be  in  essential  agreement  as 
to  what  it  is?  (For  statisticians  who  wish  to  avoid  Bayesian 
terminology,  I  would  ask  if  there  is  near  numerical  agreement 
between  their  confidence  intervals  and  the  corresponding 
Bayesian  credible  intervals.) 

In  many  applications,  the  posterior  distribution  is  public. 
Consider my earlier example, the estimation of unemployment.
The estimates are obtained from the Current
Population Survey (CPS), and are widely accepted. (Controversy
about these unemployment estimates has turned on
problems  of  definition,  such  as  the  definition  of  a 
discouraged  job  seeker,  rather  than  on  the  central  process  of 
sampling  and  estimation.  There  are  also  second-order 
controversies  that  do  not  affect  the  main  point.)  The  key  to 
consensus  about  unemployment  estimation  is  the  probability 
sampling  design  of  the  CPS  and  the  consequent  availability  of 
what I call an essentially unbiased estimator of unemployment
in the sampled population.

(By  "essentially  unbiased,"  I  mean  an  estimator  derived 
from  a  probability  sample  that  may  technically  be  subject  to 
mathematical  bias— consider  the  typical  ratio  estimate— but 
that  is  not  subject  to  uncertainty  about  selection  bias  due  to 
such  sources  as  nonresponse  or  nonprobability  selection. 
From  the  perspective  of  "model-based  analysis,"  which  is 
now  distinguished  from  "probability  sampling,"  one  might 
find similar consensus if the data permitted adequate
diagnostic checking of the specifications or assumptions of
the  model.) 

Further,  let  us  assume  that  an  estimated  standard  error  is 
obtainable,  and  the  sampling  distribution  of  the  estimator 
can  be  approximated  by  a  normal  distribution.  Then  comes 
the  essential  Bayesian  assumption:  We  assume  that 
statisticians  would  agree  that  the  prior  distribution  of 
unemployment  is  "diffuse."  That  is,  all  agree  that,  by 
comparison  with  the  evidence  conveyed  by  the  sample 
estimate,  the  prior  knowledge  is  to  be  accorded  so  little 
weight  that,  for  practical  purposes,  differences  in  that 
knowledge  can  be  ignored. 

The  assumption  about  the  point  estimator  discussed  above 
implies  an  approximate  likelihood  function.  (Technically,  a 
diffuse  prior  distribution  is  nearly  uniform  in  the  range  of 
parameter values for which the approximate likelihood function
is appreciably nonzero.) Then the posterior distribution
of unemployment is approximately normal with mean at
the point estimate and standard deviation equal to its standard
error. It is this posterior distribution that can be regarded
as "public" in the sense that its inputs, the prior distribution and
the likelihood function, could be agreed to (subject at most to
minor quibbles that would have little impact on the numerical
result) by all statisticians.
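
A minimal sketch of why a diffuse prior yields a "public" posterior, assuming a normal prior and a normal likelihood (the standard conjugate case). The estimate of 6.0 percent, the standard error of 0.2, and the two prior means are hypothetical numbers, not CPS figures.

```python
import math


def normal_posterior(prior_mean, prior_sd, estimate, std_error):
    """Posterior mean and sd for a normal prior combined with a normal likelihood."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = 1.0 / std_error ** 2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean
                 + data_precision * estimate) / post_precision
    return post_mean, math.sqrt(1.0 / post_precision)


# Hypothetical sample result: 6.0 percent unemployed, standard error 0.2.
estimate, std_error = 6.0, 0.2

# Two statisticians with different, but equally vague, prior opinions.
for prior_mean in (4.0, 8.0):
    post_mean, post_sd = normal_posterior(prior_mean, 10.0, estimate, std_error)
    print(f"prior mean {prior_mean}: posterior mean {post_mean:.3f}, sd {post_sd:.3f}")
```

Both statisticians end up with essentially the same posterior, centered at the point estimate with standard deviation close to the standard error, which is the sense of "public" used above.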

We  have  not  yet  come  to  the  Bayesian  point  estimate, 
which  is  derived  from  the  posterior  distribution  and  a  loss 
function  by  application  of  the  principle  of  minimization  of 
expected  loss.  For  any  symmetric,  and  otherwise  well 
behaved,  loss  function,  the  appropriate  Bayesian  point 
estimate is the mean of the posterior distribution. Technically,
the estimator that is unbiased in the sampling sense
leads  to  a  Bayesian  estimate  that  is  unbiased  a  posteriori. 
That  is,  the  conditional  expectation  of  the  estimate,  given 
the  data,  is  the  target  of  estimation. 
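
The principle of minimizing expected loss can be checked numerically. The sketch below assumes a normal posterior with a hypothetical mean of 0.02 and standard deviation of 0.005, discretizes it on a grid, and searches for the estimate with the smallest expected loss under two symmetric loss functions.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of the normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def best_estimate(loss, mean, sd, steps=401):
    """Grid search for the estimate minimizing expected loss
    under a normal posterior truncated at +/- 4 standard deviations."""
    lo, hi = mean - 4.0 * sd, mean + 4.0 * sd
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    weights = [normal_pdf(x, mean, sd) for x in grid]

    def expected_loss(candidate):
        return sum(w * loss(candidate - x) for x, w in zip(grid, weights))

    return min(grid, key=expected_loss)


post_mean, post_sd = 0.02, 0.005  # hypothetical posterior for relative undercount

squared = best_estimate(lambda e: e * e, post_mean, post_sd)
absolute = best_estimate(abs, post_mean, post_sd)

print(f"squared-error loss:  {squared:.4f}")
print(f"absolute-error loss: {absolute:.4f}")
```

Because the posterior is symmetric, both loss functions lead to the posterior mean, up to the resolution of the grid.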

APPLICATION  TO  ESTIMATION  OF 
UNDERCOUNT 

Turn  now  to  the  estimation  of  undercount.  Suppose 
that  an  infallible  postenumeration  survey  is  available. 
By  infallible,  I  mean  that  in  the  sampled  areas,  all  persons 
missed  by  the  census  are  located.  Hence  an  unbiased 
estimator  and  an  accompanying  standard  error  are  available. 
By  comparison  with  this  information,  we  can  regard  the  prior 
distribution  as  diffuse.  Then  the  posterior  distribution  of 
undercount  (overall  or  in  any  region  of  sufficient  size)  is 
approximately  normal,  with  its  mean  at  the  point  estimate 
and  standard  deviation  equal  to  its  standard  error.  The 
posterior  distribution  would  be  close  to  fully  public,  just  as 
in  the  example  of  the  previous  section.  Again,  the  non- 
Bayesian  point  estimate  is  the  same  as  the  Bayesian  point 
estimate  given  a  symmetric  loss  function. 

One  central  question  of  Fellegi's  paper,  and  of  the 
conference,  is  whether  a  statistical  estimate  of  undercount 
should be made at all. The alternative to a statistical estimate
is, in effect, to estimate undercount at zero and thus stick
through  thick  and  thin  with  the  headcount.  As  I  have 
formulated  the  scenario,  the  case  for  making  the  estimate 
(that  is,  adjusting  the  headcount)  would  be  very  strong.  The 
only  serious  counter-arguments  would  be  those  of  costs 
(extra  data  processing,  computation,  printing,  etc.)  or  bad 
side  effects  (public  confusion,  temptation  to  tamper  with 
"hard"  numbers,  etc.).  In  the  absence  of  such  considerations, 
the  principles  of  decision  theory  apply:  It  is  reasonable  to 
think  of  the  legislature  or  administrative  agency  as  a  single 
idealized  decisionmaker,  and  the  assumptions  about  loss 
function,  prior  distribution,  likelihood  function,  and 
posterior  distribution  are  about  as  compelling  as  I  can 
imagine  them  to  be.  The  statisticians  can  provide  the 
posterior  distribution.  A  symmetrical  loss  function  expresses 
the  statistical  equity  considerations  in  a  manner  that  should 
accord  with  legislative  intent,  which  is  primarily  concerned 
with  political  equity.  The  estimate  of  undercount  follows. 

Moreover,  the  same  reasoning  applies  widely  to  statistical 
work  directed  towards  public  policy,  quite  apart  from  the 
special  considerations  of  equity  that  are  so  prominent  here. 
If  truth  is  a  known  number  plus  an  uncertain  number,  and  if 
the  evidence  on  the  uncertain  number  is  of  the  kind  I  have 
sketched,  there  is  hardly  discretion,  at  least  from  statistical 
principles,  about  the  need  for  a  point  estimate  that  reflects 
the  available  evidence  about  the  uncertain  number. 

Suppose, however, that the standard error of the estimated
relative undercount were very large, say 0.10, as might
be  true  if  the  estimate  were  based  on  a  very  small  random 
sample.  Few  statisticians  would  be  foolhardy  enough  to  stick 
to  the  adjustment  under  this  change  of  scenario.  It  is 
essential,  however,  to  recognize  the  rational  source  of  the 
reluctance  to  make  an  adjustment.  It  is  that  the  assumption 
of a diffuse prior distribution would be hopelessly unrealistic
under the circumstances. That is, the likelihood
function would not be "sharp" with respect to any reasonable
prior distribution, at least based on American
experience.  It  is  not  a  reluctance  to  adjust  per  se,  but  a 
sensible  reluctance  to  adjust  when  other  evidence  suggests 
that  the  adjustment  indicated  by  the  sample  estimate  in 
question  is  inconsistent  with  other  relevant  information. 
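
The point about a likelihood that is not "sharp" can be made concrete by asking how much weight the sample estimate receives in the posterior mean. The sketch assumes a normal prior and a normal likelihood; the prior standard deviation of 0.01 is a hypothetical stand-in for prior opinion based on past experience, not a figure from the paper.

```python
def sample_weight(prior_sd, std_error):
    """Share of the posterior mean contributed by the sample estimate
    when both the prior and the likelihood are normal."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = 1.0 / std_error ** 2
    return data_precision / (prior_precision + data_precision)


prior_sd = 0.01  # hypothetical spread of prior opinion about relative undercount

for std_error in (0.002, 0.10):
    weight = sample_weight(prior_sd, std_error)
    print(f"standard error {std_error}: weight on sample = {weight:.3f}")
```

With a standard error of 0.002 the sample carries about 96 percent of the weight and the prior is effectively diffuse; with a standard error of 0.10 the sample carries about 1 percent, so disagreements about the prior would dominate the result.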

The  key  to  the  problem  of  adjustment  is  whether  or  not 
the  statistical  evidence  bearing  on  the  adjustment  is  of  the 
kind  that  can  dominate  other  kinds  of  information  bearing 
on  undercount  about  which  statisticians  may  reasonably 
disagree.  (In  my  opinion,  this  type  of  reasoning  suggests  why 
statisticians  are  reluctant  to  adjust  for  nonsampling  errors  in 
ordinary  survey  practice.) 

If  we  consider  an  application  in  which  one  cannot  take 
the prior distribution as diffuse with respect to the likelihood,
we cannot hope to have the kind of expert consensus
sketched above, which provides a support that statistical
professionals find comfortable. If "comfortable" sounds
self-serving, note that this kind of "comfort" has been
provided by the Census Bureau's long emphasis on
probability sampling for public studies, which contrast with
the  nonprobability  sampling  designs  employed  extensively  in 
private  commercial  surveys. 

This  simplified  scenario  suggests  that  statisticians  might  be 
able  to  agree  on  the  need  for  and  the  nature  of  adjustment  if 
the  evidence  were  based  on  some  clean  probability  sampling 
evidence  relating  to  undercount,  and  if  they  believed  that 
any  information  besides  that  of  the  sample  could  be  ignored. 

The  scenario  is  oversimplified.  It  would  be  completely 
realistic  only  for  an  infallible  postenumeration  survey,  one 
which,  in  sampled  areas,  located  and  counted  all  persons 
missed  by  the  census.  Even  in  this  example,  the  scenario 
would  apply  only  for  those  regions  in  which  the  sample  size 
is  large  enough  to  justify  the  assumption  that  statisticians  can 
agree  that  nothing  serious  would  be  lost  by  ignoring 
nonsample  information  about  undercount.  The  effective 
sample  size  of  the  postenumeration  survey  can  be  enhanced 
by  statistical  ingenuity,  as  is  illustrated  in  several  of  the 
papers  of  this  conference.  The  integration  of  sample  evidence 
coming  from  different  sampling  schemes,  such  as  post- 
enumeration  survey  and  reverse  record  checks,  is  probably 
within the range of statistical technology, and this integration
will be a major task regardless of the policy taken
towards  official  estimation  of  undercount. 

The  more  difficult  problem  is  an  allowance  for  the 
inevitable  bias  stemming  from  the  fact  that  the  post- 
enumeration  survey  or  other  sampling  schemes  are  far  from 
infallible,  being  to  some  degree  subject  to  the  same  sources 
of  bias  that  lead  to  undercount  in  the  census  itself  and  being 
correctable  only  imperfectly  by  demographic  analyses  that 
take  much  longer  to  carry  out  than  do  the  samples  themselves. 
Uncertainty  about  this  residual  bias  will  be  substantial;  there 
will  be  strong  opinions  about  its  probable  extent,  and  these 
opinions  will  probably  vary  from  statistician  to  statistician. 
As  a  result,  there  will  be  no  grounds  to  support  a  diffuse  prior 
distribution,  nor  will  there  be  substantial  consensus  about 
the  appropriate  nondiffuse  prior  distribution. 

However,  estimation  of  components  of  undercount  could 
be  useful  even  if  the  postenumeration  survey  were  partly 
fallible.  Suppose  that  undercount  consisted  of  two  mutually 
exclusive  components,  people  hard  to  find  in  the  census  who 
can  be  found  in  the  postenumeration  survey,  and  people 
impossible  to  find  in  either  because  they  don't  want  to  be 
counted.  A  postenumeration  survey  could  provide  an 
estimate  for  the  first  component  even  though  it  would  fail 
completely  with  respect  to  the  second.  It  would  be  useful  to 
proceed  with  estimation  of  the  first  component  in  line  with 
the  principles  just  cited,  even  though  the  second  component 
is  estimated  at  zero  despite  knowledge  that  the  true  number 
has to be positive. In other words, estimates of one component
of undercount can be worth making even if other components
elude completely the statistical net. (The same principle
would suggest that if a satisfactory estimate were available
for blacks but not for Hispanics, the black estimate should
be  used  for  blacks  and  zero  should  be  used  for  Hispanics.) 


LOSS  FUNCTIONS 

Fellegi's symmetrical squared-error loss function is convenient,
often at least a good approximation to what is
needed  in  applications,  and  widely  discussed  and  used  in 
statistics.  I  am  not  sure  that  it  is  a  wise  choice  here.  If  one 
estimates  the  undercount  in  a  given  region,  the  estimation 
error will have, under many allocation formulas, the approximate
effect of a proportional and unintended transfer
payment from or to the ith region. Quite independently of
allocation  formulas,  it  will  mean  unintended  gains  to  some 
people  and  losses  to  others. 

A  positive  unintended  transfer  payment  in  one  region  can 
be  associated  with  equally  unintended  negative  transfers,  in 
aggregate,  in  all  other  regions,  either  in  the  sense  that  less 
funding  for  the  program  is  available  from  a  fixed  total  or 
that  taxpayers  in  aggregate  have  to  make  a  transfer  that 
Congress  did  not  intend.  Similar  reasoning  applies  to  negative 
unintended  transfers. 
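
The transfer-payment view of estimation error can be sketched for a fixed-total program that is allocated in proportion to population. The regions, populations, headcounts, and program total below are all hypothetical; the absolute value of each unintended transfer is summed as the loss.

```python
def allocate(total, counts):
    """Split a fixed total among regions in proportion to their counts."""
    grand_total = sum(counts.values())
    return {region: total * n / grand_total for region, n in counts.items()}


# Hypothetical true populations and census headcounts for three regions.
true_pop = {"A": 1_000_000, "B": 2_000_000, "C": 3_000_000}
headcount = {"A": 950_000, "B": 1_980_000, "C": 2_970_000}
total = 60_000_000  # hypothetical fixed program total, in dollars

intended = allocate(total, true_pop)
actual = allocate(total, headcount)

# Positive values are unintended gains, negative values unintended losses.
transfers = {region: actual[region] - intended[region] for region in true_pop}
absolute_loss = sum(abs(t) for t in transfers.values())

for region, amount in transfers.items():
    print(f"region {region}: unintended transfer {amount:,.0f}")
print(f"total absolute-error loss: {absolute_loss:,.0f}")
```

Region A, with the largest relative undercount, loses funds, and regions B and C gain by the same amount in aggregate, since the total is fixed.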

Spencer  raises  the  possibility  that  equity  of  subgroups, 
such  as  Hispanics  or  blacks,  must  be  considered  separately, 
although in his formulation, conclusions may not be substantially
changed by such consideration. My tentative
position  is  that  once  one  formulates  a  loss  function  that  deals 
with  unintended  transfer  payments  by  areas,  no  further 
consideration  is  needed  of  subgroups  within  the  areas. 

The  concept  of  transfer  payment  is  well  established  and 
understood  both  in  economic  jargon  and  practical  politics, 
and  it  seems  natural  to  express  estimation  losses  as  the 
absolute  value  of  any  transfer  payment  not  intended  by 
legislation  or  administrative  regulation  but  occasioned  by 
errors  in  the  census  numbers  used  in  allocation  formulas.  I 
agree  with  Fellegi  that  there  is  a  symmetry  in  the  practical 
consequences  of  errors  in  either  direction,  but  my  feeling  is 
that  loss  is  essentially  linear  in  dollars  within  the  ranges  of 
unintended  transfers  occasioned  by  census  inaccuracies. 

A  change  from  squared-error  loss  to  absolute-error  loss 
would  require  some  modification  of  Fellegi's  calculations 
and  make  them  less  tractable.  However,  I  believe  that  the 
essential  conclusions  would  be  intact. 

(In  tidying  up  technical  details,  the  problem  posed  by 
unbounded  loss  functions,  which  can  lead  to  paradoxes, 
would  have  to  be  considered.  My  attitude  is  that  all  loss 
functions  represent  approximate  assessments  that  must  not 
be  pressed  literally  when  the  model  suggests  that  paradoxes 
arise  by  so  doing.) 

INEQUITY  OF  TRADITION  CHANGES? 

Fellegi's  paper  and  part  of  Spencer's  presuppose  that 
headcounts  are  a  logical  starting  point  for  discussion  of 
equity.  I  think  that  this  presupposition  is  ill-advised.  One  can 
attach  loss  functions  to  inequities  that  may  be  created  by 
past  practice,  but  the  attempt  to  do  so  seems  academic  when 
there is such general agreement about the potential inequities
of differential undercount. Headcounts are not a near-sacrosanct
starting point to be shifted only with great
trepidation  and  with  close  attention  to  the  distributive 
effects  of  the  change  on  areas  that  would  be  worse  off  than 
under  the  present  starting  point. 

Suppose,  for  example,  that  undercount  is  more  substantial 
in  some  race-age-sex  groups  than  in  others.  The  inequity,  if 
any,  would  arise  if  allocations  were  made  without  taking 
advantage  of  all  information  that  can  be  properly  digested 
statistically,  or  if  point  estimates  were  based  on  inequitable 
estimation  loss  functions,  not  if  there  were  simply  a 
departure  from  a  historical  practice  of  reporting  only 
headcounts.  Groups  that  "lose"  from  better  statistical 
estimation  are  treated  inequitably  only  in  the  sense  that  they 
would  be  deprived  of  an  historical  advantage  that  reflected 
limitations  of  statistical  methodology  used  in  the  past.  In 
principle,  the  costs  imposed  on  those  who  have  acquired 
what  might  be  considered  a  vested  interest  in  the  headcount 
could  be  incorporated  in  the  loss  function,  but  this  would 
raise  difficult  and  subtle  questions  that  are  dealt  with  in 
Spencer's  paper.  In  my  view,  these  questions  are  well 
bypassed;  the  argument  above  suggests  that  they  can  be 
bypassed  without  doing  violence  to  the  idea  of  equity,  and 
their  answers  would  not  easily  be  found,  as  Spencer  suggests. 
Moreover, the suggestion made by Keyfitz that any convention
about adjustment be agreed upon in advance would
tend  to  shift  the  focus  toward  the  central  equity  issue  rather 
than  the  issue  of  how  things  would  be  changed  by  a  change 
in  estimation  of  population  from  the  census. 

A  PRACTICAL  STEP 

The  realization  that  there  are  important  remaining  biases 
in  postenumeration  or  other  matching  surveys  directed  at 
estimation  of  undercount  has  had  a  negative  effect  on 
thinking  about  adjustments  for  undercount.  In  particular,  it 
has  discouraged  exploration  of  the  full  potential  of  attainable 
adjustments  using  statistical  techniques  accepted  by  all 
statisticians.  For  the  purpose  of  dealing  with  undercount  in 
1980,  we  should  simply  accept  that  some  allowances  for 
undercount  are  possible  within  the  framework  of 
methodology  accepted  by  statisticians.  By  this  I  mean  what  I 
developed  in  the  preceding  sections:  Estimates  of  an 
important  component  of  undercount  can  be  made  by 
application  of  probability  sampling  and  accepted  principles 
of  statistical  analysis.  We  should  not  hesitate  to  make  these 
estimates  for  fear  that  allocation  problems  somehow  entail 
equity  considerations  that  pose  challenges  to  standard 
statistical  practice.  Some  of  the  papers  at  this  conference 
have  tackled  what  would  be  entailed. 

The other sources of undercount remain beyond
present capability, but not necessarily beyond future capabilities. We
should  encourage  work  that  deals  with  them,  whether  at  the 
level  of  improved  field  search  procedures  in  postenumeration 
surveys, fuller development of other sources of information,
more efficient methods of matching, or extensions of
Bayesian  methodology  to  what  the  paper  by  Dempster  and 
Tomberlin  calls  "second-order  undercount  assessment." 

DETAIL  OF  DISAGGREGATION  OF 
UNDERCOUNT  ESTIMATES 

Given  an  approximately  normal  posterior  distribution  for 
undercount,  any  symmetric  loss  function  serves  to  define  the 
mean  as  a  point  estimate.  There  is  a  different  question  for 
which  the  details  of  the  symmetric  loss  function  are 
important.  If  the  cost  of  obtaining  the  needed  posterior 
distribution  is  taken  to  be  nontrivial  (either  in  the  sense  of 
computation  or  collection  of  new  data),  we  have  a  design 
problem.  For  example,  should  we  estimate  undercount  by 
States,  metropolitan  areas,  counties,  cities,  towns,  minor  civil 
divisions,  tracts,  blocks,  or  what?  Fellegi  touches  on  design 
questions  in  his  paper,  and  Spencer  does  so  indirectly  by 
subtraction  of  the  costs  of  obtaining  information  from  the 
total  amount  to  be  allocated.  If  the  question  is  confronted 
formally,  and  if  absolute  error  cost  functions  are  accepted  to 
be  appropriate,  then  the  constant  of  proportionality  would 
have  to  be  assessed  judgmentally  in  order  to  make  any  formal 
analysis of design questions. The question would be something
like, "How much would Congress be willing to spend to
eliminate  an  unintended  transfer  of  one  dollar?"  Design 
questions  like  this  are  intrinsically  harder  than  questions  of 
analysis,  and  the  solution  is  likely  to  be  made  more 
arbitrarily.  The  Congress  may  simply  authorize  a  certain  total 
expenditure  for  a  postenumeration  survey,  and  the  Census 
Bureau  may  then  have  to  stop  disaggregating  when  the 
budget  runs  out. 

OTHER  BASES  FOR  STATISTICAL  CONSENSUS? 

Unfortunately, it appears from presentations and discussions
at this conference that all statistical sampling methods
(postenumeration surveys and record matches) that can be
applied  down  to  small  areas  and  yield  relatively  prompt 
results  are  more  fallible  than  I  had  imagined  them  to  be.  For 
example,  contrary  to  statistical  folklore,  the  post- 
enumeration  survey  may  be  less  thorough  than  the  census 
itself.  The  Canadian  matching  studies  reported  by  Fellegi  are 
much  to  be  envied.  This  is  the  basis  for  my  introductory 
assertion  that  there  appear  to  be  no  controversial  questions 
about  statistical  equity  but  enormous  technical  difficulties, 
much  greater  difficulties  than  I  had  believed  before  coming 
to  the  conference.  As  a  result,  it  will  be  impossible  to  have 
the  same  confidence  in  an  estimate  of  undercount  as  in,  say, 
the  estimation  of  unemployment  from  the  CPS.  It  may  even 
be  impossible  to  have  professional  consensus  upon  which 
estimates  of  undercount  (adjustments  of  headcount)  can  be 
based. 

Perhaps,  however,  consensus  on  posterior  distributions, 
while sufficient, is not a necessary condition for professional
consensus. Hence, I have begun to wonder if some alternative
basis  for  consensus  can  suffice  for  the  immediate  problem. 
The  discussion  of  synthetic  estimation  and  related  ideas,  such 
as  those  of  Purcell  and  Kish,  may  point  the  way  in  a  useful 
direction. 

In  the  synthetic  method,  we  start  with  national  estimates 
of  undercount,  based  on  demographic  methods,  by  groups 
defined  jointly  by  race,  age,  and  sex.  (These  demographic 
estimates  become  available  with  greater  delay  after  the 
completion  of  the  census  than  do  the  results  of  the  various 
sampling  methods,  such  as  the  postenumeration  survey  and 
record matches.) Suppose the national demographic estimates
are completely accurate. For any smaller area than the
Nation as a whole (region, State, city, block, etc.) we apply
the  national  estimates  of  relative  underenumeration  by  race, 
age  and  sex  to  the  headcount  within  each  race-age-sex  group 
within  the  area. 

If  the  resulting  estimates  are  aggregated  across  all 
race-age-sex  groups  within  the  area,  we  obtain  the  synthetic 
estimate  of  undercount  for  the  area.  The  method  can  be 
applied  straightforwardly  down  to  the  smallest  areas,  as 
Robert  Hill's  paper  explains. 
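
The synthetic calculation described above can be sketched in a few lines. The group labels, headcounts, and national rates are hypothetical, and the rates are assumed to be expressed relative to the headcount (the definition proposed earlier in this discussion) so that they can be multiplied directly by the area's headcounts.

```python
def synthetic_undercount(area_headcounts, national_rates):
    """Apply national relative undercount rates to an area's headcounts,
    group by group, and sum over groups."""
    return sum(area_headcounts[group] * national_rates[group]
               for group in area_headcounts)


# Hypothetical national rates by (race, age, sex) group.
national_rates = {
    ("black", "20-24", "male"): 0.12,
    ("white", "20-24", "male"): 0.03,
}

# Hypothetical headcounts for one small area, by the same groups.
area_headcounts = {
    ("black", "20-24", "male"): 1_000,
    ("white", "20-24", "male"): 4_000,
}

undercount = synthetic_undercount(area_headcounts, national_rates)
area_total = sum(area_headcounts.values())

print(f"synthetic undercount: {undercount:.0f} persons")
print(f"relative to headcount: {undercount / area_total:.3f}")
```

The area's estimate depends only on its race-age-sex composition, so two areas with the same composition receive the same relative undercount regardless of local conditions.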

What  kind  of  professional  consensus  might  be  obtainable 
in  support  of  undercount  estimates  based  on  the  synthetic 
approach?  The  decision-theory  skeleton  cannot  be  fleshed 
out  by  sample  evidence  bearing  directly  on  the  areas  for 
which  undercount  is  to  be  estimated,  nor  can  the  regression- 
type  methods  discussed  by  other  speakers  be  applied  to 
achieve  the  same  goal.  The  fundamental  support  for  analyses 
like  those  presented  by  Fellegi  and  Spencer  would  be  absent. 
If  the  Census  Bureau  were  to  estimate  undercount  by  the 
synthetic  approach,  the  resulting  efforts  would  not  have  the 
firm  technical  support  at  which  the  Bureau  has  aimed  since 
the  development  of  probability  sampling  methods  in  the 
1940's. 

However,  it  may  be  helpful  to  explore  further.  With 
respect  to  any  one  area,  a  synthetic  estimate  could  perhaps 
serve as the mean of a consensus prior distribution for
undercount.  If  further  information  about  the  area  were 
unavailable,  or  if  such  information  were  of  a  kind  as  to  leave 
a statistician undecided even as to the best direction of
modification of the synthetic estimate, the synthetic estimate
could perhaps serve as a reasonable point estimate of
undercount.  If  the  area  were  relatively  large  and 
heterogeneous,  this  conclusion  seems  more  plausible  than  if 
the  area  were  small  and  homogeneous.  As  one  speaker  put  it, 
the  undercount  rate  for  blacks  in  a  wealthy  suburban  area  is 
likely  to  be  lower  than  that  in  a  ghetto  area  of  the  central 
city. 

But  the  evidence  presented  in  the  scatter  plot  in  Spencer's 
paper  is  disquieting.  This  shows  a  surprisingly  low  correlation 
across  States  between  synthetic  estimates  and  direct 
demographic  estimates.  The  two  methods  suggest  a  different 
geographic  picture  of  the  incidence  of  undercount,  and  the 
dispersion  among  States  is  substantially  lower  for  the 
synthetic method. Of course, Jay Siegel has pointed out that
more  serious  problems  with  demographic  methods  are  entailed 
at  the  State  level  than  at  the  national  level.  But  this 
evidence does not rule out the possibility that the uniformity
postulate of the synthetic method is seriously wide
of  the  mark. 

CONCLUSION 

Although  allowance  for  equity  per  se  does  not  appear  to 
present  a  hard  problem,  I  find  no  easy  suggestions  as  to  how 
the  Census  Bureau  should  estimate  undercount  by  sub- 
national  areas.  The  traditional  statistical  path,  which  the 
Bureau  has  followed  in  its  sampling  studies,  is  limited  by 
serious  practical  problems  that  I  had  not  fully  recognized. 
The  synthetic  method,  or  something  like  it,  offers  a  possible 
way  around  these  practical  problems,  but  it  appears  to  have 
no  support  in  cross-validation.  Whatever  is  done  in  1980,  the 
Bureau  will  fall  short  of  providing  information  about 
undercount  of  a  technical  quality  comparable  to  that  of  its 
other programs. (Of course, undercount looms as an unnoticed
shadow in the background of all family and
individual  surveys  conducted  by  the  Bureau,  including,  for 
example,  the  CPS.)  The  one  clear  advice  is  to  continue 
vigorous  research  into  ways  of  making  better  estimates  in 
the  future.  Here  my  feelings  are  suggested  by  the  clause,  "If 
we  can  put  a  man  on  the  moon,  ...." 


Floor  Discussion 


Following Harry Roberts's discussion of the papers by
Fellegi  and  Spencer,  a  member  of  the  audience  asked  Mr. 
Fellegi  how  his  conclusion  about  adjustment  would  change  if 
the  measure  of  inequity  were  large  in  one  State  and  small  in 
another.  As  the  large  provinces  in  Canada  account  for  80 
percent  of  the  population,  how  does  the  geographic  situation 
come  into  consideration  in  making  adjustments  for  inequity? 
It  was  conceded  that  the  large  provinces  do  dominate  the 
measure,  but  that  was  not  thought  to  be  relevant  to  the 
decision-theoretic  framework.  It  is  relevant,  however,  in 
terms  of  the  sample  sizes  needed  and  the  allocations.  The 
optimum  allocation  turns  out  to  be  fairly  close  to  the  square 
root  of  the  population,  which  seems  to  indicate  that  the 
effect  of  the  population  is  less  in  the  overall  picture  than 
would  be  indicated  otherwise.  That  is  a  peculiarity  of  the 
Canadian  demography.  In  terms  of  the  inequity  framework, 
the logic would not be different if the population distribution
were different. The kind of measure would take the
form  of  a  convention:  One  either  adjusts  or  not,  but  one 
does  not  adjust  only  for  some  States,  the  winners,  and  not 
for  those  that  would  lose.  Only  an  overall  measure  is 
proposed,  with  the  decision  being  made  ahead  of  the 
availability  of  counts. 

An  alternative  would  be  to  decide  on  the  basis  of 
individual  States  whether  to  adjust  or  not.  If  one  decided  to 
adjust  because  one  State  has  such  a  large  amount  of  inequity, 
one  would  adjust  for  all  States. 

It  was  suggested  that  perhaps  the  decision  not  to  adjust 
could  be  made  if  inequity  is  small.  If  there  are  a  lot  of 
"medium"  underenumerations,  but  none  tremendously  so, 
then  the  adjustment  should  be  made  if  inequity  is  reduced 
by  the  previously  agreed-upon  adjustment. 

It  was  noted  also  that  some  of  the  implications  from  the 
two  papers  today  contradict  Ms.  Slater's  conclusions.  For 
example,  is  it  Messrs.  Fellegi's  and  Spencer's  contention  that 
the larger the underenumeration, the larger the benefits that
might come from an adjustment? In the second paper, the
point  was  not  to  look  at  the  black  population  per  se  but  at 
localities  with  large  concentrations  of  blacks  and  other 
minorities.  By  definition,  an  adjustment  would  move  closer 
toward  improving  and  making  more  equitable  the  flow  of 
resources. 

Illustrations  as  to  how  to  construct  a  measure  of  equity 
designed  to  model  legislative  intent  for  a  fixed-total  amount 
of  allocation  as  well  as  a  per  capita  allocation  formula  were 
given.  Neither  of  these  types  of  legislation  relates  to  a 
particular  subgroup  of  the  population,  but  rather  to  the 
amount of Federal funds flowing to the particular governments.
A model of the inequity in undercount was constructed
to see whether the inequity could be reduced, and
this  can  then  be  tested. 

If  adjustment  is  done  in  the  decision-theoretic  framework, 
it  was  felt  that  increased  equity  would  be  achieved  overall, 
but  it  is  also  possible  that  some  groups  who  would  think  they 
would gain might not, depending on the method of adjustment
used. The decision-theoretic approach should be
followed  because  equity  is  not  fixed,  as  suggested  in  the 
Slater  paper.  Rather,  it  is  flexible  over  time  and  it  is 
unknown  what  the  uses  of  the  data  will  be  after  the  data 
have  been  produced. 

The  Bureau  commented  that  loss  functions  in  general  are 
being  considered.  There  is  no  reason  why  one  cannot  go 
beyond  the  50  States  and  treat  many  interested  groups,  and 
thus  mitigate  the  effects  of  the  undercount  for  those  groups. 
It  would  be  proper  to  discuss  the  problem  that  way,  if  the 
legislation  is  framed  so  that  the  amount  of  money  (per  capita 
or  fixed  pie)  that  ought  to  go  to  those  particular  subgroups  is 
specifically  indicated.  This  is  the  crossover  point  between 
political  and  statistical  equity.  It  is  not  the  statistician's 
function  to  second-guess  congressional  intent;  rather,  it  is 
their  function  to  see  if  the  intent  is  better  met  by  adjustment 
or  not. 
Letter  of  Invitation  for  Conference  Papers


UNITED  STATES  DEPARTMENT  OF  COMMERCE 
Bureau  of  the  Census 

Washington,  D.C.    20233 


As  the  1980  census  approaches,  there  is  increasing  interest  in  the  population
counts  and  in  measuring  and  possibly  adjusting  for  any  undercounts.   Despite
the  Bureau's  best  efforts  to  obtain  a  complete  and  accurate  count,  some
undercount  is  likely  to  occur.

Although  the  Bureau  has  been  active  in  research  concerning  the  undercount, 
we  also  are  anxious  to  encourage  as  comprehensive  a  review  of  the  undercount 
issue  by  the  general  research  community  as  possible.   A  2-day  conference 
has  been  scheduled  in  Washington  for  February  25-26,  1980,  to  examine  a 
broad  range  of  technical  concerns  regarding  the  issue  of  adjusting  for  the 
undercount.   An  agenda  that  will  accommodate  approximately  10-12  invited 
and  contributed  papers  is  being  planned,  with  discussants  for  each.   Attendance
at  the  conference  will  be  limited  to  listed  participants  and  a  small
additional  group  of  invited  persons  to  assure  efficient  use  of  the  2  days. 
The  papers  will  be  refereed  by  a  steering  committee  that  has  been  assembled 
to  plan  and  conduct  the  conference.   Authors  whose  papers  are  used  in  the 
conference  will  receive  a  $1,000  honorarium,  plus  travel  and  expenses  for 
participation.   The  proceedings  of  the  conference  will  be  published. 

We  would  like  to  invite  you  to  prepare  a  paper  on  any  of  the  issues  listed 
below  or  other  closely  related  topics  dealing  with  the  undercount  issue 
according  to  your  area  of  special  interest. 

The  issues  to  be  examined  include: 

1.  Methods  of  measuring  the  undercount  for  subnational  areas, 
including  the  quality  of  the  estimates  of  undercount  in 
relation  to  the  size  and  other  characteristics  of  the  area; 
and  the  feasibility  of  providing  accuracy  checks  or
confidence  intervals.

2.  The  timing  of  the  adjustment(s)  for  undercount.

3.  Measuring  and  adjusting  for  the  undercount  and  misreporting 
for  factors  other  than  total  population,  such  as  social  and 
economic  characteristics. 

4.  The  use  of  adjusted  figures  in  Federal  programs  and  the  impact
of  adjustments  on  the  Federal  statistical  system. 

5.  Political  and  legal  issues  in  making  adjustments  to  the 
census  counts. 

6.  The  effects  of  adjustments  on  equity  in  the  distribution  of 
Federal  funds. 

7.  Decision  theory  and  theoretical  aspects  of  adjustment. 

Although  the  list  is  not  exhaustive  and  many  of  the  issues  noted  may  not  be 
mutually  exclusive,  they  represent  some  of  the  principal  questions  to  be 
discussed  at  the  conference.   The  acceptance  of  the  proposed  papers  will 
be  left  to  the  steering  committee  with  the  goal  of  a  proper  balance  in 
the  treatment  of  the  various  questions  at  the  conference.   In  order  that 
the  steering  committee  might  establish  a  list  of  authors  by  the  end  of 
September,  your  proposal  for  a  paper  should  be  received  by  September  15. 
Dr.  Conrad  Taeuber  (Georgetown  University,  Center  for  Population  Research, 
Washington,  D.C.  20057)  is  a  consultant  to  the  steering  committee  and  the
Census  Bureau  in  the  organization  and  management  of  the  conference,  and 
I  suggest  you  contact  him  directly  in  responding  to  this  invitation. 

The  Bureau  considers  the  conference  an  important  step  in  guiding  us  in 
dealing  with  the  undercount  issue,  and  we  look  forward  to  a  favorable 
response  from  you. 

Sincerely, 



VINCENT  P.  BARABBA 

Director 

Bureau  of  the  Census 


Request  for  Papers


Distributed  to  selected  college  and  university  departments  and  at  the  1979  annual 
conference  of  the  American  Statistical  Association,  Washington,  D.C. 


Arrangements  are  being  made  for  a  2-day  conference  February  25-26,  1980,  in 
Washington,  D.C.  to  assess  the  feasibility  of  measuring  and  adjusting  for  census 
undercounts  at  different  levels  of  geography  and  for  selected  characteristics,  and  to 
discuss  possible  techniques  and  approaches  for  both  measuring  and  adjusting  for  the 
undercounts.  Although  the  Census  Bureau  has  itself  been  active  in  research  concerning 
the  undercount,  the  conference  is  being  organized  to  obtain  the  views  of  others  on  a 
broad  range  of  technical  concerns  regarding  undercount  adjustments.  The  purpose  of  this 
notice  is  to  solicit  papers  for  the  conference. 

A  conference  agenda  is  being  planned  that  will  accommodate  approximately  10-12 
papers,  with  discussants  for  each,  examining  such  issues  as  (1)  the  methods  available  for 
measuring  and  adjusting  for  the  undercount  for  subnational  areas,  (2)  the  timing  of  the 
adjustments,  (3)  the  feasibility  of  extending  the  adjustments  to  population  characteristics 
beyond  total  population,  (4)  the  use  of  adjusted  figures  in  Federal  programs,  (5)  the 
political  and  legal  issues  in  making  adjustments  to  the  census  counts,  (6)  the  effects  of 
adjustments  on  equity  in  the  distribution  of  Federal  funds,  and  (7)  decision  theory  and 
the  theoretical  aspects  of  adjustment.  Papers  are  solicited  on  these  or  closely  related 
topics  dealing  with  the  undercount  issue  according  to  your  area  of  special  interest  and 
experience. 

The  papers  will  be  refereed  by  a  steering  committee  that  has  been  assembled  to  plan 
and  conduct  the  conference.  Authors  whose  papers  are  used  in  the  conference  will  receive 
a  $1,000  honorarium,  plus  travel  and  subsistence  expenses  for  participation  in  the 
conference.  A  final  decision  on  the  acceptance  of  the  papers  will  be  left  to  the  steering 
committee  with  the  goal  of  a  proper  balance  in  the  treatment  of  the  various  questions  at 
the  conference.  In  order  that  the  steering  committee  might  establish  a  tentative  list  of 
authors  by  the  end  of  September,  your  proposal  for  a  paper  should  be  received  by 
September  15.  Proposals  for  papers  should  be  sent  directly  to  Dr.  Conrad  Taeuber 
(Georgetown  University,  Center  for  Population  Research,  Washington,  D.C.  20057),  who 
is  serving  as  a  consultant  to  the  steering  committee  and  the  Census  Bureau  in  the 
organization  and  management  of  the  conference. 
Notice  Inviting  Papers


Appeared  in  Amstat  News,  Number  58,  American  Statistical  Association
(September-October,  1979),  p.  4.


The  Bureau  of  the  Census  invites  papers  outlining  innovative  methods  for  measuring 
the  completeness  of  the  1980  Census.  Papers  will  be  reviewed  by  a  panel  of  consultants 
who  will  determine  which  ones  are  to  be  presented  to  a  conference  on  the  completeness 
of  the  census  early  in  1980.  Authors  of  accepted  papers  will  receive  $1,000  and  will  be 
invited  to  present  their  papers  at  the  conference.  The  deadline  for  receipt  of  papers  is 
November  30,  1979.  Manuscripts  and  inquiries  should  be  submitted  to  Conrad  Taeuber, 
Georgetown  University,  Washington,  D.C.  20057. 
Appendix  C


Conference  Program 

Location 

Sheraton  National  Hotel,  Arlington,  Virginia

Registration 

6:00-9:00  p.m.  Sunday,  February  24,  1980
8:00-9:00  a.m.  Monday,  February  25,  1980

Conference  Overview 

The  conference  will  examine  such
issues  as:  (1)  methods  available  for  measuring
and  adjusting  for  the  undercount  for  subnational
areas,  (2)  the  feasibility  of  extending
the  adjustments  to  population  characteristics
beyond  the  total  population,  (3)  the  use  of
adjusted  figures  in  Federal  programs,
(4)  the  political  and  legal  issues  in  making
adjustments  to  the  census  count,  and  (5)  the
effects  of  adjustment  on  equity  in  the  distribution
of  Federal  funds.

The  conference  is  designed  to  provide 
a  forum  to  consider  new  approaches  to 
measuring  the  census  undercount  and  to 
assess  the  implications  of  adjusting  the  census 
counts.    The  conference  will  (1)  bring  together
recognized  experts  to  present  and  discuss 
technical  papers  on  the  census  undercount, 
and  (2)  permit  an  exchange  of  ideas  on  a 
broad  range  of  undercount  issues. 


February  25 


PLENARY  SESSION 

9:00  -  9:45  a.m.  Introductory  Remarks

Vincent  P.  Barabba     Robert  Garcia 

9:45  -  10:15  a.m.  The  Census  Bureau  Experience  and  Plans

Jacob  S.  Siegel     Charles  D.  Jones

10:15  -  10:30  a.m.    COFFEE  BREAK

10:30  a.m.  -  12:30  p.m.   Adjustment  Pro  and  Con

Topic-Facing  the  Fact  of  Census  Incompleteness  -  Nathan  Keyfitz 
Topic-Adjusting  for  the  Decennial  Census  Undercount:
An  Environmental  Impact  Statement  -  Peter  K.  Francese
Discussant-Robert  P.  Strauss 
Floor  Discussion 

12:30  -  2:00  p.m.      LUNCHEON  PROGRAM

Topic-The  Congressional  Experience  -  Daniel  P.  Moynihan

2:00  -  3:45  p.m.
CONCURRENT  SESSION  1- 

CHAIR-CONRAD  TAEUBER 
Methodological  Considerations 

Topic-Can  Regression  be  Used  to  Estimate  Local  Undercount
Adjustments?  -  Eugene  P.  Ericksen
Topic-Modifying  Census  Counts  -  Richard  Savage
Discussant-William  G.  Madow 
Floor  Discussion 

3:45  p.m.  -  4:00  p.m.  COFFEE  BREAK

4:00  p.m.  -  5:30  p.m.
CONCURRENT  SESSION  3-
CHAIR-CONRAD  TAEUBER 
Methodological  Considerations* 

Topic-Diverse  Adjustments  for  Missing  Data  -  Leslie  Kish
Topic-Some  Empirical  Bayes  Approaches  to  Estimating  the
1980  Census  Undercount  for  Counties  -  Robert  E.  Fay  III
Discussant-Tommy  Wright 
Floor  Discussion 


CONCURRENT  SESSION  2- 

CHAIR-JOSEPH  W.  DUNCAN 
Impact  of  Adjusting 

Topic-Federal  Program  Impacts  of 
1980  Census  Undercoverage 
Adjustment-  Courtenay  M.  Slater 

Topic-Issues  on  the  Impact  of  Census 
Undercounts  on  State  and  Local 
Government  Planning  -  Herrington  J.  Bryce

Discussant-Wray  Smith
Floor  Discussion


6:00  p.m.


RECEPTION 


CONCURRENT  SESSION  4- 

CHAIR-JOSEPH  W.  DUNCAN 
The  International  Experience 

Topic-The  Australian  Experience- 
Brian  Doyle 

Topic-Summary  of  Other 

International  Experience  - 
Meyer  Zitter 

Floor  Discussion 

*Late  addition  to  program  — 
A.  P.  Dempster  and  T.  J.  Tomberlin 
February  26


PLENARY  SESSION 

9:00  a.m.  -  12:00  Noon   Other  Methods  and  Impacts

Topic-The  Synthetic  Method:  Its  Feasibility  for
Deriving  the  Census  Undercount  for
States  and  Local  Areas  -  Robert  B.  Hill

Discussant-  Joseph  Waksberg 

Topic-The  Impact  of  an  Adjustment  to  the
1980  Census  on  Congressional  and
Legislative  Reapportionment  -  Carl  P.  Carluco

Discussant-  Ben  J.  Wattenberg 
10:15  -  10:30  a.m.    COFFEE  BREAK

Floor  Discussion

Topic-Legal  and  Constitutional  Constraints  on
Census  Undercount  Adjustment  -  Donald  P.  McCullum
Floor  Discussion

12:00  -  1:30  p.m.     LUNCHEON

PLENARY  SESSION

1:30  -  3:00  p.m.  Equity  Considerations

Topic-Should  the  Census  Count  be  Adjusted 
for  Allocation  Purposes?-Equity 
Considerations-Ivan  P.  Fellegi 

Topic-Implications  of  Equity  and  Accuracy
for  Undercount  Adjustment:
A  Decision-Theoretic  Approach  -  Bruce  Spencer

Discussant-Harry  V.  Roberts
Floor  Discussion

3:00  -  3:15  p.m.        COFFEE  BREAK

PLENARY  SESSION

3:15  -  5:00  p.m.    Recap  and  Concluding  Discussion
5:00  p.m.  ADJOURN